Abstract

In the field of privacy preserving data publishing, many privacy definitions have been proposed. 
Privacy definitions are like contracts that guide the behavior of an algorithm that takes in sensitive data 
and outputs non-sensitive sanitized data. In most cases, it is not clear what these privacy definitions 
actually guarantee. 

In this paper, we propose the first (to the best of our knowledge) general framework for extracting 
semantic guarantees from privacy definitions. These guarantees are expressed as bounds on the change 
in beliefs of Bayesian attackers. 

In our framework, we first restate a privacy definition in the language of set theory and then extract 
from it a geometric object called the row cone. Intuitively, the row cone captures all the ways an attacker's 
prior beliefs can be turned into posterior beliefs after observing an output of an algorithm satisfying that 
privacy definition. The row cone is a convex set and therefore has an associated set of linear inequalities. 
Semantic guarantees are generated by interpreting these inequalities as probabilistic statements. 

Our framework can be applied to privacy definitions or to individual algorithms to identify the types 
of inferences they prevent. In this paper we use our framework to analyze the semantic privacy guarantees 
provided by randomized response, FRAPP, and several algorithms that add integer-valued noise to their 
inputs. 

1 Introduction 

The ultimate goal of statistical privacy is to produce statistically useful sanitized data from sensitive datasets. 
It has two main research thrusts: developing/analyzing privacy definitions for protecting sensitive datasets, 
and designing algorithms that satisfy a given privacy definition while producing useful outputs. The algo- 
rithm design problem is well-posed and is the focus of most of the research activity. By contrast, privacy is 
a very subtle topic for which formalizing concepts is extremely challenging. 

Analysis of privacy is important when organizations prepare to release data. When choosing a privacy 
definition (which subsequently guides the design of an algorithm for producing sanitized data), an organization
is interested in questions such as the following. What classes of information does the privacy definition
protect? Does it offer protections that the organization is interested in? Does it offer additional protections 
that are not necessary (meaning that the sanitized data will contain too much distortion)? What formal 
protections are provided by intuitive approaches to privacy that have been collected over the past 50 years 
[40]? 

In this paper we present the first (to the best of our knowledge) framework for extracting semantic 
guarantees from privacy definitions and individual algorithms. That is, instead of answering narrow questions 
such as "does privacy definition Y protect X?" the goal is to answer the more general question "what does 
privacy definition Y protect?" This lets an organization judge whether a privacy definition is too weak or
too strong for its needs. 

The framework can be used to extract guarantees about changes in the beliefs of computationally un- 
bounded Bayesian attackers. We apply our framework to several privacy definitions and algorithms for 
which we derived previously unknown privacy semantics - these include randomized response [40], FRAPP 
[1]/PRAM [20], and several algorithms that add integer-valued noise to their inputs. It turns out that their
Bayesian semantic guarantees are a consequence of their ability to protect various notions of parity of a 
dataset. Since parity is frequently not a sensitive piece of information, we also show how privacy definitions 
can be relaxed when they are too strong. 

Currently our framework requires a certain level of mathematical skill from the user. Tools and method- 
ologies for reducing this burden are part of our future plans. For example, the large class of privacy definitions 
proposed by [28] are a direct consequence of this framework; they were specifically designed to bypass the 
difficult parts of the framework. 

The framework is based on a partial axiomatization of privacy [26, 25]. However, the only ideas we need
from [26, 25] are two axioms and an anecdote about two specific privacy definitions that do not satisfy the
privacy axioms (see footnote 1) but which imply other privacy definitions that do.

Given any privacy definition, the first step of the framework is to manipulate it using the two axioms to 
obtain a related privacy definition that we call the consistent normal form (the axioms essentially remove 
implicit assumptions in the original privacy definition). From the consistent normal form we extract an 
object called the row cone which, intuitively, captures all the ways in which an attacker's prior belief can 
be turned into a posterior belief after observing an output of an algorithm that satisfies the given privacy 
definition. Mathematically, the row cone is represented as a convex set and therefore has an associated 
collection of linear inequalities. We extract semantic guarantees by re-interpreting the coefficients of the 
linear inequalities as probabilities and re-interpreting the linear inequalities themselves as statements about 
probabilities. 

Our contributions are: 

• A novel framework that introduces the concepts of consistent normal form and row cone and uses them 
to extract semantic guarantees from privacy definitions. 

• Several applications of our framework, from which we extract previously unknown semantic guarantees
for randomized response, FRAPP/PRAM, and several algorithms that add integer-valued noise to their
inputs (including the Skellam distribution [38] and a generalization of the geometric mechanism [19]).

The remainder of the paper is organized as follows. We provide a detailed overview of our approach in 
Section 2. We discuss related work in Section 3. In Section 4, we review two privacy axioms from [26, 25] 
and then we show how to use them to obtain the consistent normal form, which removes some implicit 
assumptions from a privacy definition. Using the consistent normal form, we formally define the row cone 
(a fundamental geometric object we use for extracting semantic guarantees) in Section 4.2. In Section 5, 
we then apply our framework to extract new semantic guarantees for randomized response (Section 5.1), 
FRAPP/PRAM (Section 5.2), and noise addition algorithms (Section 5.3). We discuss relaxations of privacy 
definitions in Section 5.4 and present conclusions in Section 6. 

2 The Bird's-Eye View 

We first present some basic concepts in Section 2.1 and then provide a high-level overview of our framework 
in Section 2.2. 

1 For example, one axiom states that building a histogram from sanitized data and then releasing the histogram (instead of 
the sanitized data) is acceptable. This idea is widely accepted by designers of privacy definitions, yet many privacy definitions 
inadvertently fail to satisfy that axiom [26, 25]. 

2.1 Basic Concepts

Let $\mathbb{I} = \{D_1, D_2, \ldots\}$ be the set of all possible databases. We now explain the roles played by data curators,
attackers, and privacy definitions.

The Data Curator owns a dataset $D \in \mathbb{I}$. This dataset contains information about individuals, business
secrets, etc., and therefore cannot be published as is. Thus the data curator will first choose a privacy
definition and then an algorithm $\mathfrak{M}$ that satisfies this definition. The data curator will apply $\mathfrak{M}$ to the data
$D$ and will then release its output (i.e., $\mathfrak{M}(D)$), which we refer to as the sanitized output. We assume that
the schema of $D$ is public knowledge and that the data curator will disclose the privacy definition, release
all details of the algorithm $\mathfrak{M}$ (except for the specific values of the random bits it used), and release the
sanitized output $\mathfrak{M}(D)$.

The Attacker will use the information about the schema of $D$, the sanitized output $\mathfrak{M}(D)$, and knowledge
of the algorithm $\mathfrak{M}$ to make inferences about the sensitive information contained in $D$. In our model,
the attacker is computationally unbounded. The attacker may also have side information; in the literature
this is often expressed in terms of a prior distribution over possible datasets in $\mathbb{I}$. In this paper we are
mostly interested in guarantees against attackers who reason probabilistically and so we also assume that
an attacker's side information is encapsulated in a prior distribution.

A Privacy Definition is often expressed as a set of algorithms that we trust (e.g., [40, 25]), or a set of 
constraints on how an algorithm behaves (e.g., [12]), or on the type of output it produces (e.g., [36]). Note 
that treating a privacy definition as a set of algorithms is the more general approach that unifies all of these 
ideas [25] - if a set of constraints is specified, a privacy definition becomes the set of algorithms that satisfy 
those constraints; if outputs in a certain form (such as $k$-anonymous tables [36]) are required, a privacy
definition becomes the set of algorithms that produce those types of outputs, etc. The reason that a privacy 
definition should be viewed as a set of algorithms is that it allows us to manipulate privacy definitions using 
set theory. 

Formally, a privacy definition is the set of algorithms with the same input domain that are trusted to
produce nonsensitive outputs from sensitive inputs. We therefore use the notation $\mathfrak{Priv}$ to refer to a privacy
definition and $\mathfrak{M} \in \mathfrak{Priv}$ to mean that the algorithm $\mathfrak{M}$ satisfies the privacy definition $\mathfrak{Priv}$.

The data curator will choose a privacy definition based on what it can guarantee about the privacy of 
sensitive information. If a privacy definition offers too little protection (relative to the application at hand),
the data curator will avoid it because sensitive information may end up being disclosed, thereby causing 
harm to the data curator. On the other hand, if a privacy definition offers too much protection, the resulting 
sanitized data may not be useful for statistical analysis. Thus it is important for the data curator to know 
exactly what a privacy definition guarantees. 

The Goal is to determine what guarantees a privacy definition provides. In this paper, when we discuss 
semantic guarantees, we are interested in the guarantees that always hold regardless of what sanitized output 
is produced by an algorithm satisfying that privacy definition. We focus on computationally unbounded 
Bayesian attackers and look for bounds on how much their beliefs change after seeing sanitized data. It is 
important to note that the guarantees will depend on assumptions about the attacker's prior distribution. 
This is necessary, since it is well-known that without any assumptions, it is impossible to preserve privacy 
while providing useful sanitized data [15, 27, 28]. 

2.2 Overview 

In a nutshell, our approach is to represent deterministic and randomized algorithms as matrices (with possibly 
infinitely many rows and columns) and to represent privacy definitions as sets of algorithms and hence as 
sets of matrices. If our goal is to analyze only a single algorithm, we simply treat it as a privacy definition 
(set) containing just one algorithm. The steps of our framework then require us to normalize the privacy 
definitions to remove some implicit assumptions (we call the result the consistent normal form), extract 
the set of all rows that appear in the resulting matrices (we call this the row cone), find linear inequalities 
describing those rows, reinterpret the coefficients of the linear inequalities as probabilities, and reinterpret 
the inequalities themselves as statements about probabilities to get semantic guarantees. In this section, we
describe these steps in more detail and defer a technical exposition of the consistent normal form and row
cone to Section 4.

Figure 1: The matrix representation of $\mathfrak{M}$. Columns are indexed by datasets in domain($\mathfrak{M}$) and rows are
indexed by outputs in range($\mathfrak{M}$).

2.2.1 Algorithms as matrices. 

Since our approach relies heavily on linear algebra, it is convenient to represent algorithms as matrices. 
Every algorithm $\mathfrak{M}$, randomized or deterministic, that runs on a digital computer can be viewed as a matrix
in the following way. An algorithm has an input domain $\mathbb{I} = \{D_1, D_2, \ldots\}$ consisting of datasets $D_j$, and
a range $\{\omega_1, \omega_2, \ldots\}$. The input domain $\mathbb{I}$ and range($\mathfrak{M}$) are necessarily countable because each $D_j \in \mathbb{I}$
and $\omega_i \in$ range($\mathfrak{M}$) must be encoded as finite bit strings. The probability $P(\mathfrak{M}(D_j) = \omega_i)$ is well defined
for both randomized and deterministic algorithms. The matrix representation of an algorithm is defined as
follows (see also Figure 1).

Definition 2.1 (Matrix representation of $\mathfrak{M}$). Let $\mathfrak{M}$ be a deterministic or randomized algorithm with
domain $\mathbb{I} = \{D_1, D_2, \ldots\}$ and range $\{\omega_1, \omega_2, \ldots\}$. The matrix representation of $\mathfrak{M}$ is a (potentially infinite)
matrix whose columns are indexed by $\mathbb{I}$ and rows are indexed by range($\mathfrak{M}$). The value of each entry $(i, j)$ is
the quantity $P(\mathfrak{M}(D_j) = \omega_i)$.
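
As a hedged illustration of Definition 2.1, the following Python sketch builds the matrix representation of one small randomized algorithm. It assumes, purely for illustration, that the algorithm is randomized response on tables of k binary tuples with truth probability p; the function name `randomized_response_matrix` and the values k = 2 and p = 3/4 are hypothetical choices, not taken from the paper.

```python
from fractions import Fraction
from itertools import product


def randomized_response_matrix(k, p):
    """Return (tables, matrix) with matrix[i][j] = P(M(D_j) = w_i).

    Assumption for illustration: M flips each of the k bits of its input
    independently with probability 1 - p, so domain and range coincide.
    """
    tables = list(product([0, 1], repeat=k))
    matrix = []
    for output in tables:        # one row per output w_i
        row = []
        for dataset in tables:   # one column per dataset D_j
            flips = sum(a != b for a, b in zip(dataset, output))
            row.append(p ** (k - flips) * (1 - p) ** flips)
        matrix.append(row)
    return tables, matrix


tables, M = randomized_response_matrix(k=2, p=Fraction(3, 4))
# Each column is a probability distribution over outputs for one input dataset.
column_sums = [sum(M[i][j] for i in range(len(M))) for j in range(len(tables))]
```

Exact fractions are used so that the column sums can be compared to 1 without floating-point error.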

2.2.2 Consistent Normal Form of Privacy Definitions. 

Recall from Section 2.1 that we take the unifying view that a privacy definition is a set of algorithms (i.e., 
the set of algorithms that satisfy certain constraints or produce certain types of outputs). 

Not surprisingly, there are many sets of algorithms that do not meet common expectations of what a 
privacy definition is [25]. For example, suppose that we decide to trust an algorithm $\mathfrak{M}$ to generate sanitized
outputs from the sensitive input data $D$. Suppose we know that a researcher wants to run algorithm $\mathcal{A}$ on
the sanitized data to build a histogram. If we are willing to release the sanitized output $\mathfrak{M}(D)$ publicly,
then we should also be willing to release $\mathcal{A}(\mathfrak{M}(D))$. That is, if we trust $\mathfrak{M}$ then we should also trust $\mathcal{A} \circ \mathfrak{M}$
(the composition of the two algorithms). In other words, if $\mathfrak{M} \in \mathfrak{Priv}$, for some privacy definition $\mathfrak{Priv}$, then
$\mathcal{A} \circ \mathfrak{M}$ should also be in $\mathfrak{Priv}$.

Many privacy definitions in the literature do not meet criteria such as this [25]. That is, $\mathfrak{M}$ may explicitly
satisfy a given privacy definition but $\mathcal{A} \circ \mathfrak{M}$ may not. However, since the output of $\mathfrak{M}$ is made public and
anyone can run $\mathcal{A}$ on it, these privacy definitions come with the implicit assumption that the composite
algorithm $\mathcal{A} \circ \mathfrak{M}$ should be trusted.
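
In the matrix view of Section 2.2.1, composing a trusted algorithm with a post-processing step whose random bits are independent amounts to a matrix product. The sketch below is a minimal illustration under that independence assumption; the two small matrices and the helper name `compose` are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from fractions import Fraction as F


def compose(A, M):
    """Matrix of A∘M: entry [i][j] = sum_k P(A(w_k) = z_i) * P(M(D_j) = w_k).

    Assumes the random bits of A are independent of those of M.
    """
    return [[sum(A[i][k] * M[k][j] for k in range(len(M)))
             for j in range(len(M[0]))]
            for i in range(len(A))]


# Hypothetical trusted algorithm: rows are outputs, columns are datasets.
M = [[F(3, 4), F(1, 4)],
     [F(1, 4), F(3, 4)]]
# Hypothetical post-processing step: rows are its outputs, columns are outputs of M.
A = [[F(1), F(1, 2)],
     [F(0), F(1, 2)]]

AM = compose(A, M)
```

Each row of `AM` is a nonnegative combination of rows of `M`, which is the reason the rows of trusted algorithms are a natural object of study.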

Thus, given a privacy definition $\mathfrak{Priv}$, we first must expand it to include all of the algorithms we should
trust (via a new application of privacy axioms). The result of this expansion is called the consistent normal
form and is denoted by CNF($\mathfrak{Priv}$). Intuitively, CNF($\mathfrak{Priv}$) is the complete set of algorithms we should trust
if we accept the privacy definition $\mathfrak{Priv}$ and the privacy axioms. We describe the consistent normal form in
full technical detail in Section 4.1.

Figure 2: An example of a row cone (shaded) and its defining linear inequalities.

2.2.3 The Row Cone 

Recall that we represent algorithms as matrices (Definition 2.1) and privacy definitions as sets of algorithms. 
Therefore CNF($\mathfrak{Priv}$) is really a set of matrices. The row cone of $\mathfrak{Priv}$, denoted by rowcone($\mathfrak{Priv}$), is the
set of vectors of the form $c\vec{x}$ where $c > 0$ and $\vec{x}$ is a row of a matrix corresponding to some algorithm
$\mathfrak{M} \in$ CNF($\mathfrak{Priv}$).

How does the row cone capture the semantics of $\mathfrak{Priv}$? Suppose $\mathfrak{M} \in$ CNF($\mathfrak{Priv}$) is one of the algorithms
that we trust. Let $D$ be the true input dataset and let $\omega = \mathfrak{M}(D)$ be the sanitized output that we publish.
A Bayesian attacker who sees output $\omega$ and is trying to derive sensitive information will need to compute
the posterior distribution $P(\text{data} = D_i \mid \mathfrak{M}(\text{data}) = \omega)$ for all datasets $D_i$. This posterior distribution is a
function of the attacker's prior $P(\text{data} = D_i)$ and the vector of probabilities:

$[P(\mathfrak{M}(D_1) = \omega),\ P(\mathfrak{M}(D_2) = \omega),\ \ldots]$

This vector belongs to rowcone($\mathfrak{Priv}$) because it corresponds to some row of the matrix representation of
$\mathfrak{M}$ (i.e., the row associated with output $\omega$). Note that multiplying this vector by any positive constant will
leave the attacker's posterior beliefs unchanged. The row cone is essentially the set of all such probability
vectors that the attacker can ever see if we use a trusted algorithm (i.e., something belonging to CNF($\mathfrak{Priv}$));
therefore it determines all the ways an attacker's beliefs can change (from prior to posterior).
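
The role of the likelihood vector, and the fact that rescaling it leaves the posterior unchanged, can be checked with a few lines of Python. This is only a sketch: the prior and the row below are hypothetical numbers, and `posterior` is an illustrative helper implementing Bayes' rule over a finite set of datasets.

```python
from fractions import Fraction as F


def posterior(prior, row):
    """Bayes' rule: posterior[i] is proportional to prior[i] * row[i]."""
    weights = [p * r for p, r in zip(prior, row)]
    total = sum(weights)
    return [w / total for w in weights]


prior = [F(1, 2), F(1, 3), F(1, 6)]   # hypothetical attacker prior over D_1, D_2, D_3
row = [F(3, 4), F(1, 4), F(1, 4)]     # hypothetical row: P(M(D_i) = w) for a fixed output w
scaled_row = [5 * r for r in row]     # same row multiplied by a positive constant

post = posterior(prior, row)
post_scaled = posterior(prior, scaled_row)
```

Because the normalizing constant cancels, `post` and `post_scaled` coincide, which is why only the direction of a row matters for the attacker's inference.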

Thus constraints satisfied by the row cone are also constraints on how prior probabilities could be turned 
into posterior probabilities. In Figure 2 we illustrate a row cone in 2 dimensions (i.e. the input domain 
consists of only 2 datasets). Each vector in the row cone is represented as a point in 2-d space. Later in the 
paper, it will turn out that the row cone is always a convex set and hence has an associated system of linear 
inequalities (corresponding to the intersection of halfspaces containing the row cone) as shown in Figure 2. 

2.2.4 Extracting Semantic Guarantees From the Row Cone 

The row cone is a convex set (in fact, a convex cone) and so satisfies a set of linear inequalities having the
forms [4]:

$A_1 P(\mathfrak{M}(D_1) = \omega) + A_2 P(\mathfrak{M}(D_2) = \omega) + \ldots \geq 0$ or
$A_1 P(\mathfrak{M}(D_1) = \omega) + A_2 P(\mathfrak{M}(D_2) = \omega) + \ldots = 0$ or
$A_1 P(\mathfrak{M}(D_1) = \omega) + A_2 P(\mathfrak{M}(D_2) = \omega) + \ldots > 0$

that must hold for all trusted algorithms $\mathfrak{M} \in$ CNF($\mathfrak{Priv}$) and sanitized outputs $\omega \in$ range($\mathfrak{M}$) they can
produce. The key insight is that we can re-interpret the magnitude of the coefficients $|A_1|, |A_2|, \ldots$ of these
linear inequalities as probabilities (dividing by $|A_1| + |A_2| + \cdots$ if necessary) and then re-interpret the linear
inequalities as statements about prior and posterior probabilities of an attacker. We give a detailed example
in Section 5.1, where we apply our framework to randomized response. The semantic guarantees we extract 
then have the form: "if the attacker's prior belongs to set X then here are restrictions on the posterior 
probabilities the attacker can form" (note that avoiding any assumptions on prior probabilities/knowledge 
is not possible if the goal is to release even marginally useful sanitized data [15, 27, 28]). 
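
To make the re-interpretation step concrete, here is a toy sketch in Python. It is not the analysis of Section 5.1: it assumes a two-dataset domain and assumes that the row cone is the cone spanned by the two rows (p, 1-p) and (1-p, p) of one-bit randomized response with p > 1/2. Under those assumptions the cone is cut out by two inequalities with coefficient magnitudes p and 1-p, and reading those magnitudes as a prior yields a statement about the posterior. All names and numbers are illustrative.

```python
from fractions import Fraction as F

p = F(3, 4)  # illustrative truth probability, assumed > 1/2


def in_row_cone(x):
    """Check the two halfspaces bounding the cone spanned by (p, 1-p) and (1-p, p)."""
    x1, x2 = x
    return p * x1 - (1 - p) * x2 >= 0 and p * x2 - (1 - p) * x1 >= 0


def posterior(prior, x):
    """Bayes' rule for a likelihood vector x over the two datasets."""
    weights = [a * b for a, b in zip(prior, x)]
    total = sum(weights)
    return [w / total for w in weights]


# Read the coefficient magnitudes of  p*x1 - (1-p)*x2 >= 0  as a prior on (D_1, D_2).
prior = (p, 1 - p)
# With this prior the inequality says P(D_1, w) >= P(D_2, w) for every output w,
# i.e. the attacker's posterior belief in D_1 never drops below 1/2.
extreme_rows = [(p, 1 - p), (1 - p, p)]
posteriors_for_D1 = [posterior(prior, x)[0] for x in extreme_rows]
```

In this toy setting the guarantee reads: an attacker whose prior puts probability p on the first dataset will, after any output, still consider it at least as likely as the second.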

3 Related Work 

3.1 Evaluating Privacy 

Research in statistical privacy mainly focuses on developing privacy definitions and algorithms for publishing 
sanitized data (i.e., nonsensitive information) derived from sensitive data. To the best of our knowledge, 
this paper provides the first framework for extracting semantic guarantees from privacy definitions. Other 
work on evaluating privacy definitions looks for the presence or absence of specific vulnerabilities in privacy 
definitions or sanitized data. 

In the official statistics community, re-identification experiments are performed to assess whether indi- 
viduals can be identified from sanitized data records [6]. In many such experiments, software is used to link 
sanitized data records to the original records [43]. Reiter [34] provides a detailed example of how to apply
the decision-theoretic framework of Duncan and Lambert [11] to measure disclosure risk. There are many 
other methods for assessing privacy for the purposes of official statistics; for surveys, see [6, 41, 42]. 

Other work in statistical privacy seeks to identify and exploit specific types of weaknesses that may 
be present in privacy definitions. Dwork and Naor [15] formally proved that it is not possible to publish 
anonymized data that prevents an attacker from learning information about people who are not even part 
of the data unless the anonymized data has very little utility or some assumptions are made about the 
attacker's background knowledge. Lambert [29] suggests that harm can occur even when an individual is 
linked to the wrong anonymized record (as long as the attacker's methods are plausible). Thus one of the 
biggest themes in privacy is preventing an attacker from linking an individual to an "anonymized" record [9],
possibly using publicly available data [39] or other knowledge [32]. Dinur and Nissim [10] and later Dwork
et al. [14] showed fundamental limits to the amount of information that can be released even under very
weak privacy definitions (information-theoretically and computationally [16]). These attacks generally work
by removing noise that was added in the sanitization process [22, 21, 31]. Ganta et al. [18] demonstrated a 
composition attack where independent anonymized data releases can be combined to breach privacy; thus a 
desirable property of privacy definitions is to have privacy guarantees degrade gracefully in the presence of 
multiple independent releases of sanitized data. The minimality attack [44] showed that privacy definitions 
must account for attackers who know the algorithm used to generate sanitized data; otherwise the attackers 
may reverse-engineer the algorithm to cause a privacy breach. The de Finetti attack [24] shows that privacy
definitions based on statistical models are susceptible to attackers who make inferences using different models 
and use those inferences to undo the anonymization process; thus it is important to consider a wide range of 
inference attacks. Also, one should consider the possibility that an attacker may be able to manipulate data 
(e.g. by creating many new accounts in a social network) prior to its release to help break the subsequent 
anonymization of the data [3]. Note that privacy concerns can also be associated with aggregate
information such as trade secrets (and not just rows in a table) [7, 28]. 

3.2 Privacy Definitions 

In this section, we review some privacy definitions that will be examined in this paper.

3.2.1 Syntactic Privacy Definitions

A large class of privacy definitions places restrictions on the format of the output of a randomized algorithm. 
Such privacy definitions are known as syntactic privacy definitions. The prototypical syntactic privacy 
definition is $k$-anonymity [36, 39]. In the $k$-anonymity model, a data curator first designates a set of attributes
to be the quasi-identifier. An algorithm $\mathfrak{M}$ then satisfies $k$-anonymity if its input is a table $T$ and its output
is another table $T^*$ that is $k$-anonymous: for every tuple in $T^*$, there are $k - 1$ other tuples that have the
same value for the quasi-identifier attributes [36, 39]. Algorithms satisfying $k$-anonymity typically work by
generalizing (coarsening) attribute values. For example, if the data contains an attribute representing the age
of a patient, the algorithm could generalize this attribute into age ranges of size 10 (e.g., [0-9], [10-19], etc.)
or ranges of size 20, etc. Quasi-identifier attributes are repeatedly generalized until a table $T^*$ satisfying
$k$-anonymity is produced. The rationale behind $k$-anonymity is that quasi-identifier attributes may be recorded
in publicly available datasets. Linking those datasets to the original table $T$ may allow individual records to
be identified, but linking to the $k$-anonymous table $T^*$ will not result in unique matches.

3.2.2 Randomized Response 

Randomized response is a technique developed by Warner [40] to deal with privacy issues when answering 
sensitive questions in a face-to-face survey. There are many variations of randomized response. One of the 
most popular is the following: a respondent answers truthfully with probability $p$ and lies with probability
$1 - p$, thus ensuring that the interviewer is not certain about the respondent's true answer. Thus the
scenario where we can apply randomized response is the following: the input table $T$ contains 1 binary
attribute and $k$ tuples. We can apply randomized response to $T$ by applying the following procedure to
each tuple: flip the binary attribute with probability $1 - p$. The perturbed table, which we call $T^*$, is then
released. Note that randomized response is a privacy definition that consists of exactly one algorithm: the
algorithm that flips each bit independently with probability $1 - p$. We use our framework to extract semantic
guarantees for randomized response in Section 5.1. 
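
The perturbation procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the table is assumed to be a list of 0/1 values, and the function name and the optional `rng` parameter are hypothetical conveniences.

```python
import random


def randomized_response(table, p, rng=None):
    """Keep each bit with probability p and flip it with probability 1 - p."""
    rng = rng or random.Random()
    # rng.random() lies in [0, 1), so p = 1 never flips and p = 0 always flips.
    return [bit if rng.random() < p else 1 - bit for bit in table]


T = [1, 0, 0, 1, 1]                      # hypothetical input table with k = 5 tuples
T_star = randomized_response(T, p=0.75, rng=random.Random(0))
```

Each tuple is perturbed independently of the others, which matches the description of the single algorithm that makes up this privacy definition.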

3.2.3 PRAM and FRAPP 

PRAM [20] and FRAPP [1] are generalizations of randomized response to tables where tuples can have more 
than one attribute and the attributes need not be binary. PRAM can be thought of as a set of algorithms 
that independently perturb tuples, while FRAPP is an extension of PRAM that adds formally specified 
privacy restrictions to these perturbations. 

Let $\mathcal{TUP}$ be the domain of all tuples. Each algorithm $\mathfrak{M}_Q$ satisfying PRAM is associated with a transition
matrix $Q$ of transition probabilities, where the entry $Q_{b,a}$ is the probability $P(a \to b)$ that the algorithm
changes a tuple with value $a \in \mathcal{TUP}$ to the value $b \in \mathcal{TUP}$. Given a dataset $D = \{t_1, \ldots, t_n\}$, the algorithm
$\mathfrak{M}_Q$ assigns a new value to the tuple $t_1$ according to the transition probability matrix $Q$, then it independently
assigns a new value to the tuple $t_2$, etc. It is important to note that the matrix representation of $\mathfrak{M}_Q$ (as
discussed in Section 2.2.1) is not the same as the transition matrix $Q$. As we will discuss in Section 5.2,
the relationship between the two is that the matrix representation of $\mathfrak{M}_Q$ is equal to $\bigotimes^n Q$, where $\otimes$ is the
Kronecker product.
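
The difference between the transition matrix and the matrix representation can be seen on a tiny example. The sketch below assumes a binary tuple domain, a hypothetical transition matrix Q, and datasets treated as ordered sequences of n tuples; `kron` and `pram_matrix` are illustrative helper names written in plain Python so that the example is self-contained.

```python
from fractions import Fraction as F
from functools import reduce


def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def pram_matrix(Q, n):
    """n-fold Kronecker power of Q: the matrix of perturbing n tuples independently."""
    return reduce(kron, [Q] * n)


# Hypothetical transition matrix: Q[b][a] = P(a -> b) for tuple values a, b in {0, 1}.
Q = [[F(3, 4), F(1, 4)],
     [F(1, 4), F(3, 4)]]

M2 = pram_matrix(Q, n=2)   # 4 x 4 matrix over datasets (0,0), (0,1), (1,0), (1,1)
```

For n = 2 the result has one column per two-tuple dataset and one row per two-tuple output, whereas Q itself only describes what happens to a single tuple.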

FRAPP, with privacy parameter $\gamma$, imposes a restriction on these algorithms. This restriction, known
as $\gamma$-amplification [17], requires that the transition matrices $Q$ satisfy the constraints $\frac{Q_{a,b}}{Q_{a,c}} \leq \gamma$ for all
$a, b, c \in \mathcal{TUP}$. This condition can also be phrased as $\frac{P(b \to a)}{P(c \to a)} \leq \gamma$.

3.2.4 Differential Privacy 

Differential privacy [12, 13] is defined as follows: 

Definition 3.1. A randomized algorithm $\mathfrak{M}$ satisfies $\epsilon$-differential privacy if for all pairs of databases $T_1, T_2$
that differ only in the value of one tuple and for all sets $S$, $P(\mathfrak{M}(T_1) \in S) \leq e^{\epsilon} P(\mathfrak{M}(T_2) \in S)$.

Differential privacy guarantees that the sanitized data that is output has little dependence on the value
of any individual's tuple (for small values of $\epsilon$). It is known to be a weaker privacy definition than randomized
response. Using our framework, we show in Section 5.1.1 that the difference between the two is that
randomized response provides additional protection for the parity of every subset of the data. 
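
For an algorithm with a finite matrix representation, the condition in Definition 3.1 can be checked directly. The sketch below relies on the fact that, for a countable range, it suffices to check single outputs, since summing the per-output inequality over the elements of a set S gives the inequality for S. The matrix, the neighbor list, and the function name are hypothetical, and the bound $e^{\epsilon}$ is passed in as an exact number to avoid floating-point comparisons.

```python
from fractions import Fraction as F


def satisfies_dp(matrix, neighbor_pairs, exp_eps):
    """True if row[j1] <= exp_eps * row[j2] for every output row and ordered neighbors (j1, j2)."""
    return all(row[j1] <= exp_eps * row[j2]
               for row in matrix
               for (j1, j2) in neighbor_pairs)


p = F(3, 4)
# Hypothetical example: one-bit randomized response (rows are outputs, columns are datasets).
M = [[p, 1 - p],
     [1 - p, p]]
# The two one-tuple datasets differ in the value of one tuple, in both orders.
neighbors = [(0, 1), (1, 0)]

ok_with_3 = satisfies_dp(M, neighbors, exp_eps=3)   # e^eps = 3, i.e. eps = ln 3
ok_with_2 = satisfies_dp(M, neighbors, exp_eps=2)   # e^eps = 2 is too small here
```

In this toy case the largest ratio between neighboring columns is p / (1 - p) = 3, so the check passes exactly when $e^{\epsilon} \geq 3$.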

4 Consistent Normal Form and the Row Cone

In this section, we formally define the consistent normal form CNF($\mathfrak{Priv}$) and rowcone($\mathfrak{Priv}$) of a privacy
definition $\mathfrak{Priv}$ and derive some of their important properties. These properties will later be used in Section
5 to extract novel semantic guarantees for randomized response, FRAPP/PRAM, and for several algorithms 
(including the geometric mechanism [19]) that add integer random noise to their inputs. 

4.1 The Consistent Normal Form 

Recall that we treat any privacy definition as a set of algorithms with the same input domain. For
example, we view $k$-anonymity as the set of all algorithms that produce $k$-anonymous tables [36]. As noted
in [25], such a set can often have inconsistencies. For example, consider an algorithm $\mathfrak{M}$ that first transforms
its input into a $k$-anonymous table and then builds a statistical model from the result and outputs the
parameters of that model. Technically, this algorithm $\mathfrak{M}$ does not satisfy $k$-anonymity because "model
parameters" are not a "$k$-anonymous table." However, it would be strange if the data curator decided that
releasing a $k$-anonymous table was acceptable but releasing a model built solely from that table (without
any side information) was not acceptable. The motivation for the consistent normal form is that it makes
sense to enlarge the set $\mathfrak{Priv}$ by adding $\mathfrak{M}$ into this set.

It turns out that privacy axioms can help us identify the algorithms that should be added. For this 
purpose, we will use the following two axioms from [25]. 

Axiom 4.1 (Post-processing [25]). Let $\mathfrak{Priv}$ be a privacy definition (set of algorithms). Let $\mathfrak{M} \in \mathfrak{Priv}$ and
let $\mathcal{A}$ be any algorithm whose domain contains the range of $\mathfrak{M}$ and whose random bits are independent of the
random bits of $\mathfrak{M}$. Then the composed algorithm $\mathcal{A} \circ \mathfrak{M}$ (which first runs $\mathfrak{M}$ and then runs $\mathcal{A}$ on the result)
should also belong to $\mathfrak{Priv}$ (see footnote 2).

Note that Axiom 4.1 prevents algorithm $\mathcal{A}$ from using side information since its only input is $\mathfrak{M}(D)$.

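Axiom 4.2 (Convexity [25]). Let $\mathfrak{Priv}$ be a privacy definition (set of algorithms). Let $\mathfrak{M}_1 \in \mathfrak{Priv}$ and
$\mathfrak{M}_2 \in \mathfrak{Priv}$ be two algorithms satisfying this privacy definition. Define the algorithm $\text{choice}^p_{\mathfrak{M}_1, \mathfrak{M}_2}$ to be the
algorithm that runs $\mathfrak{M}_1$ with probability $p$ and $\mathfrak{M}_2$ with probability $1 - p$. Then $\text{choice}^p_{\mathfrak{M}_1, \mathfrak{M}_2}$ should belong
to $\mathfrak{Priv}$.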

The justification in [25] for the convexity axiom (Axiom 4.2) is the following. If both $\mathfrak{M}_1$ and $\mathfrak{M}_2$ belong
to $\mathfrak{Priv}$, then both are trusted to produce sanitized data from the input data. That is, the outputs of $\mathfrak{M}_1$ and
$\mathfrak{M}_2$ leave some amount of uncertainty about the input data. If the data curator randomly chooses between
$\mathfrak{M}_1$ and $\mathfrak{M}_2$, the sensitive input data is protected by two layers of uncertainty: the original uncertainty
added by either $\mathfrak{M}_1$ or $\mathfrak{M}_2$ and the uncertainty about which algorithm was used. Further discussion can be
found in [25].
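
In matrix terms, the random choice between two algorithms is an entrywise mixture of their matrices whenever the two share the same domain and the same ordered range. The sketch below illustrates this special case only; the two matrices and the helper name `choice_matrix` are hypothetical.

```python
from fractions import Fraction as F


def choice_matrix(p, M1, M2):
    """Matrix of the algorithm that runs M1 with probability p and M2 otherwise.

    Assumption: M1 and M2 share the same domain and the same ordered range,
    so their matrices have the same shape and can be mixed entrywise.
    """
    return [[p * a + (1 - p) * b for a, b in zip(r1, r2)]
            for r1, r2 in zip(M1, M2)]


# Hypothetical trusted algorithms over two datasets and two outputs.
M1 = [[F(3, 4), F(1, 4)],
      [F(1, 4), F(3, 4)]]
M2 = [[F(1, 2), F(1, 2)],
      [F(1, 2), F(1, 2)]]

mixed = choice_matrix(F(1, 2), M1, M2)
```

Every row of `mixed` is a convex combination of the corresponding rows of `M1` and `M2`, so mixing cannot produce a likelihood vector that lies outside the convex hull of those two rows.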

Using these two axioms, we define the consistent normal form as follows: 3 

Definition 4.3 (CNF). Given a privacy definition $\mathfrak{Priv}$, its consistent normal form, denoted by CNF($\mathfrak{Priv}$),
is the smallest set of algorithms that contains $\mathfrak{Priv}$ and satisfies Axioms 4.1 and 4.2.

Essentially, the consistent normal form uses Axioms 4.1 and 4.2 to turn implicit assumptions about which
algorithms we trust into explicit statements: if we are prepared to trust any $\mathfrak{M} \in \mathfrak{Priv}$ then by Axioms 4.1
and 4.2 we should also trust any $\mathfrak{M} \in$ CNF($\mathfrak{Priv}$). The set CNF($\mathfrak{Priv}$) is also the largest set of algorithms
we should trust if we are prepared to accept $\mathfrak{Priv}$ as a privacy definition.

The following theorem provides a useful characterization of CNF($\mathfrak{Priv}$) that will help us analyze privacy
definitions in Section 5. 

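2 Note that if $\mathfrak{M}_1$ and $\mathfrak{M}_2$ are algorithms with the same range and domain such that $P(\mathfrak{M}_1(D_i) = \omega) = P(\mathfrak{M}_2(D_i) = \omega)$ for
all $D_i \in \mathbb{I}$ and $\omega \in$ range($\mathfrak{M}_1$), then we consider $\mathfrak{M}_1$ and $\mathfrak{M}_2$ to be equivalent.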

3 Note that this is a more general and useful idea than the observation in [25] that 2 specific variants of differential privacy 
do not satisfy the axioms but do imply a third variant that does satisfy the axioms. 

Theorem 4.4. Given a privacy definition $\mathfrak{Priv}$, its consistent normal form CNF($\mathfrak{Priv}$) is equivalent to the
following.

1. Define $\mathfrak{Priv}^{(1)}$ to be the set of all (deterministic and randomized) algorithms of the form $\mathcal{A} \circ \mathfrak{M}$, where
$\mathfrak{M} \in \mathfrak{Priv}$, range($\mathfrak{M}$) $\subseteq$ domain($\mathcal{A}$), and the random bits of $\mathcal{A}$ and $\mathfrak{M}$ are independent of each other.

2. For any positive integer $n$, finite sequence $\mathfrak{M}_1, \ldots, \mathfrak{M}_n$ and probability vector $\vec{p} = (p_1, \ldots, p_n)$, use
the notation $\text{choice}^{\vec{p}}(\mathfrak{M}_1, \ldots, \mathfrak{M}_n)$ to represent the algorithm that runs $\mathfrak{M}_i$ with probability $p_i$. Define
$\mathfrak{Priv}^{(2)}$ to be the set of all algorithms of the form $\text{choice}^{\vec{p}}(\mathfrak{M}_1, \ldots, \mathfrak{M}_n)$ where $n$ is a positive integer,
$\mathfrak{M}_1, \ldots, \mathfrak{M}_n \in \mathfrak{Priv}^{(1)}$, and $\vec{p}$ is a probability vector.

3. Set CNF($\mathfrak{Priv}$) $= \mathfrak{Priv}^{(2)}$.

Proof. See Appendix A. □ 

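Corollary 4.5. If $\mathfrak{Priv} = \{\mathfrak{M}\}$ consists of just one algorithm, CNF($\mathfrak{Priv}$) is the set of all algorithms of the
form $\mathcal{A} \circ \mathfrak{M}$, where range($\mathfrak{M}$) $\subseteq$ domain($\mathcal{A}$) and the random bits in $\mathcal{A}$ and $\mathfrak{M}$ are independent of each other.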

Proof. See Appendix B. □ 



4.2 The Row Cone 

Having motivated the row cone in Section 2.2.3, we now formally define it and derive its basic properties. 

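Definition 4.6 (Row Cone). Let $\mathbb{I} = \{D_1, D_2, \dots\}$ be the set of possible input datasets and let $\mathfrak{Priv}$ be a
privacy definition. The row cone of $\mathfrak{Priv}$, denoted by $\mathrm{rowcone}(\mathfrak{Priv})$, is defined as the set of vectors:

$$\left\{ \big(c \cdot P[\mathfrak{M}(D_1) = \omega],\ c \cdot P[\mathfrak{M}(D_2) = \omega],\ \dots\big) \ :\ c > 0,\ \mathfrak{M} \in \mathrm{CNF}(\mathfrak{Priv}),\ \omega \in \mathrm{range}(\mathfrak{M}) \right\}$$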

Recalling the matrix representation of algorithms (as discussed in Section 2.2.1 and Figure 1), we see that
a vector belongs to the row cone if and only if it is proportional to some row of the matrix representation of
some trusted algorithm $\mathfrak{M} \in \mathrm{CNF}(\mathfrak{Priv})$.

Given an $\mathfrak{M} \in \mathrm{CNF}(\mathfrak{Priv})$ and $\omega \in \mathrm{range}(\mathfrak{M})$, the attacker uses the vector
$(P[\mathfrak{M}(D_1) = \omega], P[\mathfrak{M}(D_2) = \omega], \dots) \in \mathrm{rowcone}(\mathfrak{Priv})$ to convert the prior distribution $P(\text{data} = D_i)$
to the posterior $P(\text{data} = D_i \mid \mathfrak{M}(\text{data}) = \omega)$. Scaling this likelihood vector by $c > 0$ does not change the
posterior distribution, but it does make it easier to work with the row cone.
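
The claim that scaling a likelihood vector leaves the posterior unchanged is easy to check numerically. The sketch below is a minimal illustration; the prior values, the scale factor 7, and the function name `posterior` are arbitrary choices made for the example. It applies Bayes' rule to one row of the randomized response matrix for k = 2 and to a scaled copy of that row.

```python
from fractions import Fraction as F

def posterior(prior, likelihood):
    """Bayes' rule: prior[i] = P(data = D_i), likelihood[i] = c * P[M(D_i) = w]."""
    joint = [pi * li for pi, li in zip(prior, likelihood)]
    total = sum(joint)
    return [j / total for j in joint]

p = F(3, 4)
prior = [F(1, 2), F(1, 4), F(1, 8), F(1, 8)]            # over D_1..D_4 = 11, 10, 01, 00
row = [p * p, p * (1 - p), p * (1 - p), (1 - p) ** 2]   # P[M(D_i) = 11] for randomized response
scaled = [7 * x for x in row]                           # the same row scaled by c = 7

post = posterior(prior, row)
post_scaled = posterior(prior, scaled)
```

Both calls return the same distribution because the factor c cancels in the normalization step.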

Constraints satisfied by $\mathrm{rowcone}(\mathfrak{Priv})$ are therefore constraints shared by all of the likelihood vectors
$(P[\mathfrak{M}(D_1) = \omega], P[\mathfrak{M}(D_2) = \omega], \dots) \in \mathrm{rowcone}(\mathfrak{Priv})$ and therefore they constrain the ways an attacker's
beliefs can change no matter what trusted algorithm $\mathfrak{M} \in \mathrm{CNF}(\mathfrak{Priv})$ is used and what sanitized output
$\omega \in \mathrm{range}(\mathfrak{M})$ is produced.

The row cone has an important geometric property: 

Theorem 4.7. $\mathrm{rowcone}(\mathfrak{Priv})$ is a convex cone.

Proof. See Appendix C. □

The fact that the row cone is a convex cone means that it satisfies an associated set of linear constraints 
(from which we derive semantic privacy guarantees). For technical reasons, the treatment of these constraints 
differs slightly depending on whether the row cone is finite dimensional (which occurs if the number of possible 
datasets is finite) or infinite dimensional (if the set of possible datasets is countably infinite). We discuss 
this next. 
4.2.1 Finite dimensional row cones.

A closed convex set in finite dimensions is expressible as the solution set to a system of linear inequalities
[4]. When the row cone is closed then the linear inequalities have the form:

$$A_{1,1} P[\mathfrak{M}(D_1) = \omega] + \dots + A_{1,n} P[\mathfrak{M}(D_n) = \omega] \geq 0$$
$$A_{2,1} P[\mathfrak{M}(D_1) = \omega] + \dots + A_{2,n} P[\mathfrak{M}(D_n) = \omega] \geq 0$$
$$\vdots$$

(with possibly some equalities of the form $B_1 P[\mathfrak{M}(D_1) = \omega] + \dots + B_n P[\mathfrak{M}(D_n) = \omega] = 0$ thrown in). When
the row cone is not closed, it is still well-approximated by such linear inequalities: their solution set contains
the row cone, and the row cone contains the solution set when the '$\geq$' in the constraints is replaced with
'$>$'.



4.2.2 Infinite dimensional row cones. 

When the domain of the data is countably infinite 4 , vectors in the row cone have infinite length since there
is one component for each possible dataset. The vectors in the row cone belong to the vector space $\ell_\infty$, the
set of vectors whose components are bounded. Linear constraints in this vector space can have the form:

$$A_1 P[\mathfrak{M}(D_1) = \omega] + A_2 P[\mathfrak{M}(D_2) = \omega] + \cdots \geq 0 \qquad (1)$$

(where $\sum_i |A_i| < \infty$)

but, if one accepts the Axiom of Choice, linear constraints are much more complicated and are generally 
defined via finitely additive measures [37]. On the other hand, in constructive mathematics 5 , such more 
complicated linear constraints cannot be proven to exist ([37], Sections 14.77, 23.10, and 27.45, and [30]). 
Therefore we only consider the types of linear constraints shown in Equation 1. 



4.2.3 Interpretation of linear constraints. 

Starting with a linear inequality of the form $A_1 P(\mathfrak{M}(D_1) = \omega) + A_2 P(\mathfrak{M}(D_2) = \omega) + \cdots \geq 0$, we can separate
out the positive coefficients, say $A_{i_1}, A_{i_2}, \dots$, from the negative coefficients, say $A_{j_1}, A_{j_2}, \dots$, to rewrite it in
the form:

$$A_{i_1} P(\mathfrak{M}(D_{i_1}) = \omega) + A_{i_2} P(\mathfrak{M}(D_{i_2}) = \omega) + \cdots \geq |A_{j_1}| P(\mathfrak{M}(D_{j_1}) = \omega) + |A_{j_2}| P(\mathfrak{M}(D_{j_2}) = \omega) + \cdots$$

where all of the coefficients are now positive. We can view each $A_{i_\ell}$ as a possible value for the prior
probability $P(\text{data} = D_{i_\ell})$ (or a value proportional to a prior probability). Set $S_1 = \{D_{i_1}, D_{i_2}, \dots\}$ and
$S_2 = \{D_{j_1}, D_{j_2}, \dots\}$. This allows us to interpret the linear constraints as statements such as
$\alpha P(\text{data} \in S_1, \mathfrak{M}(\text{data}) = \omega) \geq P(\text{data} \in S_2, \mathfrak{M}(\text{data}) = \omega)$. Further algebraic manipulations (and a use of constants
independent of $\mathfrak{M}$) result in statements such as:

$$\alpha \geq \frac{P(\text{data} \in S_2 \mid \mathfrak{M}(\text{data}) = \omega)}{P(\text{data} \in S_1 \mid \mathfrak{M}(\text{data}) = \omega)} \qquad (2)$$

$$\alpha' \geq \frac{P(\text{data} \in S_2 \mid \mathfrak{M}(\text{data}) = \omega) \,/\, P(\text{data} \in S_2)}{P(\text{data} \in S_1 \mid \mathfrak{M}(\text{data}) = \omega) \,/\, P(\text{data} \in S_1)} \qquad (3)$$

Equation 2 means that if an attacker uses a certain class of prior distributions then after seeing the sanitized
data, the probability of some set $S_2$ is no more than $\alpha$ times the probability of some set $S_1$. Equation 3



4 We need not consider uncountably infinite domains since digital computers can only process finite bit strings, of which 
there are countably many. 

5 More precisely, mathematics based on Zermelo-Fraenkel set theory plus the Axiom of Dependent Choice [37].
means that if an attacker uses a certain class of priors, then the relative odds of $S_2$ vs. $S_1$ can increase by
at most a factor of $\alpha'$ after seeing the sanitized data 6 .
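
The algebra that leads from a linear constraint to Equations 2 and 3 can be spelled out as follows. This is a sketch under an assumption that is ours and not stated in the text: the attacker's prior $\pi$ satisfies $\pi(D_{i_\ell}) = c_1 A_{i_\ell}$ on $S_1$ and $\pi(D_{j_\ell}) = c_2 |A_{j_\ell}|$ on $S_2$ for some positive constants $c_1, c_2$ (hypothetical notation), and $P(\mathfrak{M}(\text{data}) = \omega) > 0$.

```latex
\begin{align*}
\sum_{\ell} A_{i_\ell}\, P(\mathfrak{M}(D_{i_\ell}) = \omega)
  &\geq \sum_{\ell} |A_{j_\ell}|\, P(\mathfrak{M}(D_{j_\ell}) = \omega)
  && \text{(row cone constraint)} \\
\frac{1}{c_1}\, P(\text{data} \in S_1,\ \mathfrak{M}(\text{data}) = \omega)
  &\geq \frac{1}{c_2}\, P(\text{data} \in S_2,\ \mathfrak{M}(\text{data}) = \omega)
  && \text{(substitute the prior)} \\
\alpha\, P(\text{data} \in S_1 \mid \mathfrak{M}(\text{data}) = \omega)
  &\geq P(\text{data} \in S_2 \mid \mathfrak{M}(\text{data}) = \omega),
  \quad \alpha = c_2 / c_1
  && \text{(divide by } P(\mathfrak{M}(\text{data}) = \omega)\text{)} \\
\alpha'
  &= \alpha \cdot \frac{P(\text{data} \in S_1)}{P(\text{data} \in S_2)}
  && \text{(rescale by the prior odds)}
\end{align*}
```

The third line is Equation 2 after dividing by $P(\text{data} \in S_1 \mid \mathfrak{M}(\text{data}) = \omega)$, and the fourth line gives the constant for which Equation 3 holds. Both constants depend only on the prior and the coefficients, not on $\mathfrak{M}$.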

Of particular importance are the sets $S_1$ and $S_2$ of possible input datasets, whose relative probabilities
are constrained by the privacy definition. In an ideal world they would correspond to something we are
trying to protect (for example, $S_1$ could be the set of databases in which Bob has cancer and $S_2$ could be
the set of databases in which Bob is healthy). If a privacy definition is not properly designed, $S_1$ and $S_2$
could correspond to concepts that may not need protection for certain applications (for example, $S_1$ could
be the set of databases with even parity and $S_2$ could be the set of databases with odd parity). In any case,
it is important to examine existing privacy definitions and even specific algorithms to see which sets they
end up protecting.

5 Applications 

In this section, we present the main technical contributions of this paper - applications of our framework for
the extraction of novel semantic guarantees provided by randomized response, FRAPP/PRAM, and several
algorithms (including a generalization of the geometric mechanism [19]) that add integer-valued noise to
their inputs. We show randomized response and FRAPP offer particularly strong protections on different
notions of parity of the input data. Since such protections are often unnecessary, we show, in Section 5.4,
how to manipulate the row cone to relax privacy definitions.

We will make use of the following theorem which shows how to derive $\mathrm{CNF}(\mathfrak{Priv})$ and $\mathrm{rowcone}(\mathfrak{Priv})$ for
a large class of privacy definitions that are based on a single algorithm.

Theorem 5.1. Let $\mathbb{I}$ be a finite or countably infinite set of possible datasets. Let $\mathfrak{M}^*$ be an algorithm with
$\mathrm{domain}(\mathfrak{M}^*) = \mathbb{I}$. Let $M^*$ be the matrix representation of $\mathfrak{M}^*$ (Definition 2.1). If $(M^*)^{-1}$ exists and the $L_1$
norm of each column of $(M^*)^{-1}$ is bounded by a constant C then

(1) A bounded row vector $x \in \mathrm{rowcone}(\{\mathfrak{M}^*\})$ if and only if $x \cdot m \geq 0$ for every column m of $(M^*)^{-1}$.

(2) An algorithm $\mathfrak{M}$, with matrix representation M, belongs to $\mathrm{CNF}(\{\mathfrak{M}^*\})$ if and only if the matrix
$M (M^*)^{-1}$ contains no negative entries.

(3) An algorithm $\mathfrak{M}$, with matrix representation M, belongs to $\mathrm{CNF}(\{\mathfrak{M}^*\})$ if and only if every row of M
belongs to $\mathrm{rowcone}(\{\mathfrak{M}^*\})$.

Proof. See Appendix D. □ 
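
In the finite case, the membership tests of Theorem 5.1 reduce to a matrix inversion followed by sign checks. The sketch below illustrates this for one-bit randomized response with p = 3/4. The helper names (`inverse`, `in_rowcone`, `in_cnf`) and the two candidate algorithms `noisier` and `sharper` are hypothetical and chosen only for the example; the inverse is computed by textbook Gauss-Jordan elimination over exact rationals.

```python
from fractions import Fraction as F

def inverse(m):
    """Gauss-Jordan inverse of a square matrix of Fractions (assumes the inverse exists)."""
    n = len(m)
    aug = [list(row) + [F(int(i == j)) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def in_rowcone(x, m_star_inv):
    """Theorem 5.1(1): x . m >= 0 for every column m of (M*)^{-1}."""
    n = len(m_star_inv)
    return all(sum(x[i] * m_star_inv[i][j] for i in range(n)) >= 0 for j in range(n))

def in_cnf(m, m_star_inv):
    """Theorem 5.1(3): every row of M belongs to the row cone."""
    return all(in_rowcone(row, m_star_inv) for row in m)

p = F(3, 4)
M_star = [[p, 1 - p], [1 - p, p]]                        # one-bit randomized response
inv = inverse(M_star)
noisier = [[F(2, 3), F(1, 3)], [F(1, 3), F(2, 3)]]       # flips the bit with probability 1/3
sharper = [[F(9, 10), F(1, 10)], [F(1, 10), F(9, 10)]]   # flips the bit with probability 1/10
```

As one would expect, the algorithm that adds more noise than $\mathfrak{M}^*$ passes the test, while the one that adds less noise fails it.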

Note that one of our applications, namely the study of FRAPP/PRAM, does not satisfy the hypothesis 
of this theorem as it is not based on a single algorithm. Nevertheless, this theorem still turns out to be 
useful for analyzing FRAPP/PRAM. 

5.1 Randomized Response 

In this section, we apply our framework to extract Bayesian semantic guarantees provided by randomized 
response. Recall that randomized response applies to tables with k tuples and a single binary attribute. 
Thus each database can be represented as a bit string of length k. We formally define the domain of datasets 
and the randomized response algorithm as follows. 

Definition 5.2 (Domain of randomized response). Let the input domain $\mathbb{I} = \{D_1, \dots, D_{2^k}\}$ be the set of
all bit strings of length k. The bit strings are ordered in reverse lexicographic order. Thus $D_1$ is the string
whose bits are all 1 and $D_{2^k}$ is the string whose bits are all 0.

6 In fact, that idea has led to the creation of a large class of privacy definitions [28] as a followup to this framework; the 
linear constraints that characterize privacy definitions in [28] are precisely the constraints of what we here call the row cone, 
hence all the difficult parts of the framework have been bypassed in [28]. 
Definition 5.3 (Randomized response algorithm). Given a privacy parameter $p \in [0,1]$, let $\mathfrak{M}_{rr(p)}$ be the
algorithm that, on input $D \in \mathbb{I}$, independently flips each bit of D with probability $1 - p$.

For example, when k = 2 then $|\mathbb{I}| = 4$ and the matrix representation of $\mathfrak{M}_{rr(p)}$ is

$$
\begin{array}{c|cccc}
 & D_1 = 11 & D_2 = 10 & D_3 = 01 & D_4 = 00 \\ \hline
\omega_1 = 11 & p^2 & p(1-p) & p(1-p) & (1-p)^2 \\
\omega_2 = 10 & p(1-p) & p^2 & (1-p)^2 & p(1-p) \\
\omega_3 = 01 & p(1-p) & (1-p)^2 & p^2 & p(1-p) \\
\omega_4 = 00 & (1-p)^2 & p(1-p) & p(1-p) & p^2
\end{array}
$$


Note that randomized response, as a privacy definition, is equal to $\{\mathfrak{M}_{rr(p)}\}$. The next lemma says that
without loss of generality, we may assume that $p > 1/2$.

Lemma 5.4. Given a privacy parameter p, define $q = \max(p, 1 - p)$. Then

• $\mathrm{CNF}(\{\mathfrak{M}_{rr(p)}\}) = \mathrm{CNF}(\{\mathfrak{M}_{rr(q)}\})$.

• If p = 1/2 then $\mathrm{CNF}(\{\mathfrak{M}_{rr(p)}\})$ consists of the set of algorithms whose outputs are statistically independent
of their inputs (i.e. those algorithms $\mathfrak{M}$ where $P[\mathfrak{M}(D_i) = \omega] = P[\mathfrak{M}(D_j) = \omega]$ for all $D_i, D_j \in \mathbb{I}$
and $\omega \in \mathrm{range}(\mathfrak{M})$), and therefore attackers learn nothing from those outputs.

Proof. See Appendix E. □ 

Therefore, in the remainder of this section, we assume p > 1/2 without loss of generality. Now we derive 
the consistent normal form and row cone of randomized response. 

Theorem 5.5 (CNF and rowcone). Given input space $\mathbb{I} = \{D_1, \dots, D_{2^k}\}$ of bit strings of length k and a
privacy parameter $p > 1/2$,

• A vector $x = (x_1, \dots, x_{2^k}) \in \mathrm{rowcone}(\{\mathfrak{M}_{rr(p)}\})$ if and only if for every bit string s of length k,

$$\sum_{i=1}^{2^k} p^{\mathrm{ham}(s, D_i)} (p - 1)^{k - \mathrm{ham}(s, D_i)} x_i \geq 0$$

where $\mathrm{ham}(s, D_i)$ is the Hamming distance between s and $D_i$.

• An algorithm $\mathfrak{M}$ with matrix representation M belongs to $\mathrm{CNF}(\{\mathfrak{M}_{rr(p)}\})$ if and only if every row of M
belongs to $\mathrm{rowcone}(\{\mathfrak{M}_{rr(p)}\})$.

Proof. See Appendix F. □ 
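
The constraints of Theorem 5.5 can be checked mechanically for small k. The sketch below assumes that the constraint for a bit string s reads $\sum_i p^{\mathrm{ham}(s,D_i)} (p-1)^{k-\mathrm{ham}(s,D_i)} x_i \geq 0$, which for k = 2 agrees with the four constraints listed in Example 5.6. The function names (`datasets`, `rr_row`, `in_rr_rowcone`) are hypothetical. It verifies that every row of the randomized response matrix lies in the row cone and that the row of an algorithm that releases its input unchanged does not.

```python
from fractions import Fraction as F
from itertools import product

def ham(s, d):
    """Hamming distance between two bit tuples of equal length."""
    return sum(a != b for a, b in zip(s, d))

def datasets(k):
    """Bit strings of length k in reverse lexicographic order (all-ones string first)."""
    return sorted(product((0, 1), repeat=k), reverse=True)

def rr_row(omega, p, k):
    """Row of the randomized response matrix for output omega: P[M(D_i) = omega]."""
    return [p ** (k - ham(omega, d)) * (1 - p) ** ham(omega, d) for d in datasets(k)]

def in_rr_rowcone(x, p, k):
    """Check the 2^k linear constraints, one per bit string s."""
    ds = datasets(k)
    return all(
        sum(p ** ham(s, d) * (p - 1) ** (k - ham(s, d)) * xi for d, xi in zip(ds, x)) >= 0
        for s in ds)

p = F(3, 4)
rows_k3 = [rr_row(w, p, 3) for w in datasets(3)]
identity_row = [F(1), F(0), F(0), F(0)]   # P[output = 11] when the input is released as-is (k = 2)
```

The number of constraints grows as $2^k$ and each has $2^k$ terms, so this brute-force check is only practical for small k.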

We illustrate this theorem with our running example of tables with k = 2 tuples. 

Example 5.6. (CNF of randomized response, k = 2). Let $p > 1/2$. With 2 tuples and one binary attribute,
the domain $\mathbb{I} = \{11, 10, 01, 00\}$. An algorithm $\mathfrak{M}$ with matrix representation M belongs to the CNF of
randomized response (with privacy parameter p) if for every vector $x = (x_{11}, x_{10}, x_{01}, x_{00})$ that is a row of
M, the following four constraints hold:

$$p^2 x_{00} + (1-p)^2 x_{11} \geq p(1-p) x_{01} + p(1-p) x_{10} \qquad (4)$$

$$(1-p)^2 x_{00} + p^2 x_{11} \geq p(1-p) x_{01} + p(1-p) x_{10} \qquad (5)$$

$$p^2 x_{01} + (1-p)^2 x_{10} \geq p(1-p) x_{00} + p(1-p) x_{11} \qquad (6)$$

$$(1-p)^2 x_{01} + p^2 x_{10} \geq p(1-p) x_{00} + p(1-p) x_{11} \qquad (7)$$
We use Example 5.6 to explain the intuition behind the process of extracting Bayesian semantic guarantees
from the row cone of randomized response, as given by the constraints in Equations 4, 5, 6, and 7. Let us
consider the following three attackers.

Attacker 1. This attacker has the prior beliefs that $P(\text{data} = 11) = p^2$, $P(\text{data} = 00) = (1-p)^2$ and
$P(\text{data} = 01) = P(\text{data} = 10) = p(1-p)$, so that each bit is independent and equals 1 with probability p
(this p is the same as the privacy parameter p in randomized response). Let us consider the effect of the
constraint in Equation 5 on the attacker's inference. This constraint says that for all $\mathfrak{M}$ in the CNF of
randomized response and for all $\omega \in \mathrm{range}(\mathfrak{M})$,

$$p^2 P[\mathfrak{M}(11) = \omega] + (1-p)^2 P[\mathfrak{M}(00) = \omega] \geq p(1-p) P[\mathfrak{M}(01) = \omega] + p(1-p) P[\mathfrak{M}(10) = \omega] \qquad (8)$$

Note that the coefficients in the linear constraints have the same values as the prior probabilities of the
possible input datasets. Substituting those prior beliefs into Equation 8, we get the constraint that for all
$\omega \in \mathrm{range}(\mathfrak{M})$:

$$P(\text{data} = 11) P[\mathfrak{M}(11) = \omega] + P(\text{data} = 00) P[\mathfrak{M}(00) = \omega] \geq P(\text{data} = 01) P[\mathfrak{M}(01) = \omega] + P(\text{data} = 10) P[\mathfrak{M}(10) = \omega]$$

which in turn is equal to the constraint on the attacker's belief about the joint distribution of the input and
output of $\mathfrak{M}$:

$$P[\mathrm{parity}(\text{data}) = 0 \wedge \mathfrak{M}(\text{data}) = \omega] \geq P[\mathrm{parity}(\text{data}) = 1 \wedge \mathfrak{M}(\text{data}) = \omega]$$

Dividing both sides by $P(\mathfrak{M}(\text{data}) = \omega)$ (where data is a random variable), we get the following constraints
that $\mathfrak{M}$ imposes on the attacker's posterior distribution:

$$P[\mathrm{parity}(\text{data}) = 0 \mid \mathfrak{M}(\text{data}) = \omega] \geq P[\mathrm{parity}(\text{data}) = 1 \mid \mathfrak{M}(\text{data}) = \omega]$$

Thus $\mathfrak{M}$ guarantees that if an attacker believes that each bit in the database is independently equal to 1 with
probability p, then after seeing the sanitized output, the attacker will believe that the true input is at least as
likely to have even parity as odd parity. Also, note that the attacker's prior belief about even parity (which is
$p^2 + (1-p)^2$) is greater than the attacker's prior belief about odd parity (which is $2p(1-p)$). Therefore $\mathfrak{M}$
guarantees that the attacker will not change his mind about which parity, even or odd, is more likely.

Attacker 2. Now consider a different attacker who believes that the first bit in the true database is 1 with
probability $1 - p$ and the second bit is 1 with probability p (both bits are still independent). Then, by similar
calculations, Equation 6 implies that for this attacker

$$P[\mathrm{parity}(\text{data}) = 1 \mid \mathfrak{M}(\text{data}) = \omega] \geq P[\mathrm{parity}(\text{data}) = 0 \mid \mathfrak{M}(\text{data}) = \omega]$$

Thus, after seeing any sanitized output, the attacker will believe that the true input was at least as likely to have
odd parity as even parity. This attacker's prior belief about odd parity (which is $p^2 + (1-p)^2$) is greater
than this attacker's prior belief about even parity (which is $2p(1-p)$). Thus again, any $\mathfrak{M}$ in the CNF of
randomized response will ensure that the attacker will not change his mind about which parity is more
likely.

Attacker 3. This attacker believes that the first bit is 1 with probability 1/2 and believes the second bit
is 1 with probability p (the bits are independent of each other). In this case, the attacker's prior beliefs are
that odd parity and even parity are equally likely. It is easy to see that now the output of $\mathfrak{M}$ can make
the attacker change his mind about which parity is more likely (for example, consider what happens when
$\mathfrak{M}_{rr(p)}$ outputs 01 or 00). This is true because the attacker was so unsure about parity that even the slightest
amount of evidence can change his beliefs about which parity is (slightly) more likely. However, the attacker
will not change his mind about the parity of the second bit, for which he has greater confidence. This result
is a consequence of Theorem 5.7 below, which formally presents the semantic guarantees of randomized
response.

The difference between Attacker 3 and Attackers 1, 2 is that Attacker 3 expressed the weakest prior
preference between even and odd parity (i.e. 1/2 vs. 1/2). Attackers 1 and 2 had stronger prior beliefs
about which parity is more likely and as a result randomized response guarantees that they will not change
their minds about which parity is more likely.
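
The behaviour of the three attackers can be reproduced by brute-force enumeration for k = 2. The sketch below uses p = 3/4 and hypothetical function names (`rr_likelihood`, `parity_joint`). For each output it computes the joint probabilities of (even parity, output) and (odd parity, output); since both share the normalizer P(output), comparing them is the same as comparing the posteriors.

```python
from fractions import Fraction as F
from itertools import product

def rr_likelihood(omega, d, p):
    """P[M_rr(p)(d) = omega]: each bit is retained with probability p."""
    out = F(1)
    for a, b in zip(omega, d):
        out *= p if a == b else 1 - p
    return out

def parity_joint(bit_priors, omega, p):
    """Return (P[parity = 0, output = omega], P[parity = 1, output = omega]) for an
    attacker who believes bit i equals 1 independently with probability bit_priors[i]."""
    joint = [F(0), F(0)]
    for d in product((0, 1), repeat=len(bit_priors)):
        prior = F(1)
        for bit, q in zip(d, bit_priors):
            prior *= q if bit == 1 else 1 - q
        joint[sum(d) % 2] += prior * rr_likelihood(omega, d, p)
    return tuple(joint)

p = F(3, 4)
outputs = list(product((0, 1), repeat=2))
attacker1 = [parity_joint([p, p], w, p) for w in outputs]
attacker2 = [parity_joint([1 - p, p], w, p) for w in outputs]
attacker3 = [parity_joint([F(1, 2), p], w, p) for w in outputs]
```

For Attackers 1 and 2 the preferred parity never loses its lead (ties are possible), whereas for Attacker 3 some outputs favour even parity and others favour odd parity.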

The following theorem generalizes these observations to show that randomized response protects the
parity of any set of bits whose prior probabilities are $\geq p$ or $\leq 1 - p$ (where p is the privacy parameter).
It also shows that the only algorithms that have this property are the ones that belong to the trusted set
$\mathrm{CNF}(\{\mathfrak{M}_{rr(p)}\})$. Also note that, by Theorem 5.5, an algorithm $\mathfrak{M}$ with matrix representation M belongs
to $\mathrm{CNF}(\{\mathfrak{M}_{rr(p)}\})$ if and only if every row of M belongs to $\mathrm{rowcone}(\{\mathfrak{M}_{rr(p)}\})$. Thus the following theorem
completely characterizes the privacy guarantees provided by randomized response. 7

Theorem 5.7. Let p be a privacy parameter and let $\mathbb{I} = \{D_1, \dots, D_{2^k}\}$. Let $\mathfrak{M}$ be an algorithm that has
a matrix representation whose every row belongs to the row cone of randomized response. If the attacker
believes that the bits in the data are independent and bit i is equal to 1 with probability $q_i$, then $\mathfrak{M}$ protects
the parity of any subset of bits that have prior probability $\geq p$ or $\leq 1 - p$. That is, for any subset $J = \{\ell_1, \dots, \ell_m\}$
of bits of the input data such that $q_{\ell_j} \geq p \vee q_{\ell_j} \leq 1 - p$ for $j = 1, \dots, m$, the following holds:

• If $P(\mathrm{parity}(J) = 0) \geq P(\mathrm{parity}(J) = 1)$ then $P(\mathrm{parity}(J) = 0 \mid \mathfrak{M}(\text{data})) \geq P(\mathrm{parity}(J) = 1 \mid \mathfrak{M}(\text{data}))$

• If $P(\mathrm{parity}(J) = 1) \geq P(\mathrm{parity}(J) = 0)$ then $P(\mathrm{parity}(J) = 1 \mid \mathfrak{M}(\text{data})) \geq P(\mathrm{parity}(J) = 0 \mid \mathfrak{M}(\text{data}))$

Furthermore, an algorithm $\mathfrak{M}$ can only provide these guarantees if every row of its matrix representation
belongs to $\mathrm{rowcone}(\{\mathfrak{M}_{rr(p)}\})$.

Proof. See Appendix G. □ 

In many cases, protecting the parity of an entire dataset is not necessary in privacy preserving applications 
(in fact, some people find it odd). 8 Using the row cone, it is possible to relax a privacy definition to get rid 
of such unnecessary protections. We discuss this idea in Section 5.4. 

5.1.1 The relationship between randomized response and differential privacy. 

When setting $\epsilon = \log \frac{p}{1-p}$ then it is well known that randomized response satisfies $\epsilon$-differential privacy.
Also, for this parameter setting, differential privacy provides the same protection as randomized response
for any given bit in the dataset - a bit corresponds to the record of one individual and differential privacy
would allow a bit's value to be retained with probability at most $e^{\epsilon}/(1 + e^{\epsilon}) = p$ (and therefore flipped
with probability $1 - p$). However, Theorem 5.7 shows that randomized response goes beyond the protection
afforded by differential privacy by requiring stronger protection of the parity of larger sets of bits as well.

Note that Kasiviswanathan et al. [23] proved a learning-theoretic separation result between randomized 
response and differential privacy which roughly states that randomized response cannot be used to efficiently 
learn a problem called MASKED-PARITY. That concept of parity involves solving a linear system of equa- 
tions in a d-dimensional vector space over the integers modulo 2. While very different from the notion of 
parity that we study, one direction of future work is to determine if our result about the semantic guarantees 
of randomized response can lead to a new proof of the result by Kasiviswanathan et al. [23] . 

5.2 FRAPP and PRAM 

In some cases, it may be difficult to derive the row cone of a privacy definition $\mathfrak{Priv}$. In these cases, it helps to
have some notion of an approximation to a row cone from which semantic guarantees can still be extracted.
One might wonder whether the Hausdorff distance [2] or some other measure of distance between sets might 
be a meaningful measure of the quality of an approximation. Unfortunately it is not at all clear what such 
a distance measure means in terms of semantic guarantees; finding a meaningful quantitative measure is an 
interesting open problem. 

7 All other guarantees are a consequence of them.

8 In this setting, we are normally interested only in the parity of individual bits since each bit corresponds to the value of 
one individual's record. 
Thus we take the following approach. If we cannot derive $\mathrm{rowcone}(\mathfrak{Priv})$, our goal becomes to find a
strictly larger convex cone $r'$ that contains $\mathrm{rowcone}(\mathfrak{Priv})$. The reason is that any linear inequality satisfied
by $r'$ is also satisfied by $\mathrm{rowcone}(\mathfrak{Priv})$; the semantic interpretation of the linear inequality is then a guarantee
provided by $\mathfrak{Priv}$. Thus the approximation may lose some semantics but never generates incorrect
semantics. This idea leads to the following definition.

Definition 5.8 (Approximation cone). Given a privacy definition $\mathfrak{Priv}$, an approximation cone of $\mathfrak{Priv}$ is
a closed convex cone $r'$ such that $\mathrm{rowcone}(\mathfrak{Priv}) \subseteq r'$.

In this section, we apply this approximation idea to FRAPP [1], which is a privacy definition based on
the perturbation technique PRAM [20]. Recall from Section 3.2.3 that the types of algorithms considered by
FRAPP are algorithms $\mathfrak{M}_Q$ that have a transition matrix Q where the (a, b) entry, denoted by $P_Q(b \to a)$,
is the probability that a tuple with value b gets changed to a. The algorithm $\mathfrak{M}_Q$ modifies each tuple
independently using this transition matrix.

Definition 5.9 (Domain of FRAPP). Define $TUV = \{a_1, a_2, \dots, a_N\}$ to be the domain of tuples. Choose
an arbitrary ordering for these values. Define the data domain to be $\mathbb{I} = \{D_1, D_2, \dots\}$ where each $D_i$ is a
sequence of k tuples from TUV and the list $D_1, D_2, \dots$ is in lexicographic order.

Definition 5.10 ($\gamma$-FRAPP [1]). Given a privacy parameter $\gamma > 1$, $\gamma$-FRAPP is the privacy definition
containing all algorithms $\mathfrak{M}_Q$ that use transition matrices Q with the $\gamma$-amplification property [17]: for all
tuple values $a, b, c \in TUV$, $\frac{P_Q(b \to a)}{P_Q(c \to a)} \leq \gamma$.

We now construct an approximation cone for $\gamma$-FRAPP. If $\mathfrak{M}_Q$ is an algorithm in $\gamma$-FRAPP with
transition matrix Q, then it is easy to see that the matrix representation of $\mathfrak{M}_Q$, denoted by $M_Q$, is:

$$M_Q = \bigotimes_{i=1}^{k} Q$$

(where k is the number of tuples in databases from $\mathbb{I}$ and $\bigotimes$ is the Kronecker product).

Let $e_j$ be the column vector of length N that has a 1 in position j and 0 in all other positions. Write
$p = \frac{\gamma}{1+\gamma}$ (so that $\gamma = \frac{p}{1-p}$). The constraints imposed on Q by $\gamma$-FRAPP can then be written as:

$$\forall i, j \in \{1, \dots, N\} \ :\ Q\,(p\, e_i - (1-p)\, e_j) \succeq \vec{0}$$

where $\vec{0}$ is the vector containing only 0 components and $a \succeq b$ means that $a - b$ has no negative components.
Therefore every vector x that is the row vector of $M_Q$, the matrix representation of $\mathfrak{M}_Q$, must satisfy the
constraints:

$$\forall i_1, \dots, i_k, j_1, \dots, j_k \in \{1, \dots, N\} \ :\ M_Q \left( \bigotimes_{\ell=1}^{k} (p\, e_{i_\ell} - (1-p)\, e_{j_\ell}) \right) \succeq \vec{0} \qquad (9)$$

Using these constraints we can define the Kronecker approximation cone for FRAPP. 

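Definition 5.11. (Kronecker approximation cone $K_p$). Given a privacy parameter $\gamma$, let $p = \frac{\gamma}{1+\gamma}$. Define
the Kronecker approximation cone, denoted by $K_p$, to be the set of vectors x that satisfy the linear constraints
in Equation 9, i.e. $x \cdot \bigotimes_{\ell=1}^{k} (p\, e_{i_\ell} - (1-p)\, e_{j_\ell}) \geq 0$ for all index choices (where $e_{j_\ell}$ is the $j_\ell$-th column
vector of the $N \times N$ identity matrix).

Lemma 5.12. Let $p = \frac{\gamma}{1+\gamma}$. Then $K_p$ is an approximation cone for $\gamma$-FRAPP.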

Proof. See Appendix H. □ 
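
A small numerical check makes the Kronecker construction concrete. The sketch below is an illustration under stated assumptions: the transition matrix `Q` (N = 3, $\gamma$ = 2, k = 2) is an arbitrary example, all function names are hypothetical, and membership in $K_p$ is tested by enumerating every constraint $x \cdot \bigotimes_\ell (p\, e_{i_\ell} - (1-p)\, e_{j_\ell}) \geq 0$. It confirms that each row of $M_Q = Q \otimes Q$ lies in $K_p$.

```python
from fractions import Fraction as F
from itertools import product

def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def kron_power(q, k):
    """k-fold Kronecker product of q with itself."""
    out = [[F(1)]]
    for _ in range(k):
        out = kron(out, q)
    return out

def kron_vec(vectors):
    """Kronecker product of a sequence of vectors."""
    out = [F(1)]
    for v in vectors:
        out = [x * y for x in out for y in v]
    return out

def has_amplification(q, gamma):
    """gamma-amplification: q[a][b] <= gamma * q[a][c] for all a, b, c (q[a][b] = P_Q(b -> a))."""
    return all(x <= gamma * y for row in q for x in row for y in row)

def in_kronecker_cone(x, p, n, k):
    """Check x . kron_l(p*e_{i_l} - (1-p)*e_{j_l}) >= 0 for every choice of indices."""
    basis = []
    for i in range(n):
        for j in range(n):
            v = [F(0)] * n
            v[i] += p
            v[j] -= 1 - p
            basis.append(v)
    return all(sum(a * b for a, b in zip(x, kron_vec(vs))) >= 0
               for vs in product(basis, repeat=k))

gamma = F(2)
p = gamma / (1 + gamma)
Q = [[F(1, 2), F(1, 4), F(1, 4)],
     [F(1, 4), F(1, 2), F(1, 4)],
     [F(1, 4), F(1, 4), F(1, 2)]]
MQ = kron_power(Q, 2)
unperturbed_row = [F(1)] + [F(0)] * 8   # row of an algorithm that releases the data unchanged
```

The enumeration has $N^{2k}$ constraints, so it is only meant for toy sizes.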
The connection between the approximation cone $K_p$ of FRAPP and $\mathrm{rowcone}(\{\mathfrak{M}_{rr(p)}\})$, the row cone of
randomized response, is clear once we rephrase the linear constraints that define $\mathrm{rowcone}(\{\mathfrak{M}_{rr(p)}\})$ in Theorem
5.5 as follows:

$$x \in \mathrm{rowcone}(\{\mathfrak{M}_{rr(p)}\}) \iff \forall i_1, \dots, i_k, j_1, \dots, j_k \in \{1, 2\} \ :\ x \cdot \left( \bigotimes_{\ell=1}^{k} (p\, e'_{i_\ell} - (1-p)\, e'_{j_\ell}) \right) \geq 0$$

where $e'_{j_\ell}$ is the $j_\ell$-th column vector of the $2 \times 2$ identity matrix.

Thus we can use Theorem 5.7, which gave a semantic interpretation for randomized response, to derive
some of the semantic guarantees provided by FRAPP.

These guarantees are as follows. Suppose Bob is an attacker who satisfies the following conditions.

• Bob believes that the tuples in the true dataset are independent.

• Bob has ruled out all but two values for the tuple of each individual. That is, for each i, Bob knows that
the value of tuple $t_i$ is either some value $a_i \in TUV$ or $b_i \in TUV$.

• For each tuple $t_i$, Bob believes that $t_i = a_i$ with probability $q_i$ and $t_i = b_i$ with probability $1 - q_i$.

Then for any subset J of the tuples such that $t_i \in J$ only if $q_i \geq p = \frac{\gamma}{1+\gamma}$, if Bob believes
$P(\mathrm{parity}(J) = 1) \geq P(\mathrm{parity}(J) = 0)$ then after seeing output $\omega$, Bob believes $P(\mathrm{parity}(J) = 1 \mid \omega) \geq P(\mathrm{parity}(J) = 0 \mid \omega)$,
and if Bob believes $P(\mathrm{parity}(J) = 0) \geq P(\mathrm{parity}(J) = 1)$ then $P(\mathrm{parity}(J) = 0 \mid \omega) \geq P(\mathrm{parity}(J) = 1 \mid \omega)$.
Here parity can be defined arbitrarily by either treating $a_i$ or $b_i$ as a 1 bit.

In the case of FRAPP, we also see that one of its guarantees is the protection of parity. This seems to 
be a general property of privacy definitions that are based on algorithms that operate on individual tuples 
independently. 

5.3 Additive Noise 

In this section, we analyze a different class of algorithms - those that add noise to their inputs. In the
cases we study, the input domain is $\mathbb{I} = \{\dots, -2, -1, 0, 1, 2, \dots\}$ and the algorithm being analyzed adds an
integer-valued random variable to its input. In the first case that we study (Section 5.3.1), the algorithm
adds a random variable of the form $Z = X - Y$ where X and Y have the negative binomial distribution; this
includes the geometric mechanism [19] as a special case. In the second case (Section 5.3.2), the algorithm
adds a random variable from a Skellam distribution [38], which has the form $Z = X - Y$ where X and Y
have Poisson distributions.

5.3.1 Differenced Negative Binomial Mechanism 

The Geometric(p) distribution is a probability distribution over nonnegative integers k with mass function
$p^k (1-p)$. The negative binomial distribution, NB(p, r), is a probability distribution over nonnegative integers
k with mass function $\binom{k+r-1}{k} p^k (1-p)^r$. It is well-known (and easy to show) that an NB(p, r) random variable
has the same distribution as the sum of r independent Geometric(p) random variables. In order to get a
distribution over the entire set of integers, we can use the difference of two independent NB(p, r) random
variables. This leads to the following noise addition algorithm:

Definition 5.13. (Differenced Negative Binomial Mechanism $\mathfrak{M}_{DNB(p,r)}$). Define $\mathfrak{M}_{DNB(p,r)}$ to be the
algorithm that adds $X - Y$ to its input, where X and Y are two independent random variables having the
negative binomial distribution with parameters p and r. We call $\mathfrak{M}_{DNB(p,r)}$ the differenced negative binomial
mechanism.

The relationship to the geometric mechanism [19], which adds a random integer k with distribution
$\frac{1-p}{1+p}\, p^{|k|}$, is captured in the following lemma:
Lemma 5.14. $\mathfrak{M}_{DNB(p,1)}$, the differenced negative binomial mechanism with r = 1, is the geometric mechanism.

Proof. See Appendix J. □

The following theorem gives us the row cone of the differenced negative binomial mechanism. 

Theorem 5.15. A bounded row vector $x = (\dots, x_{-2}, x_{-1}, x_0, x_1, x_2, \dots)$ belongs to $\mathrm{rowcone}(\{\mathfrak{M}_{DNB(p,r)}\})$
if for all integers k,

$$\forall k: \quad \sum_{j=-r}^{r} (-1)^j f_B\!\left(j;\ \tfrac{p}{1+p},\ r\right) x_{k+j} \geq 0$$

where p and r are the parameters of the differenced negative binomial distribution and $f_B(\cdot\,;\ p/(1+p),\ r)$ is the
probability mass function of the difference of two independent binomial (not negative binomial) distributions
whose parameters are $p/(1+p)$ (success probability) and r (number of trials).

Proof. See Appendix K. □ 

To interpret Theorem 5.15 note that (1) the coefficients of the linear inequality are given by the distribution
of the difference of two binomials, (2) the coefficients alternate in signs, and (3) for each integer k,
the corresponding linear inequality has the coefficients shifted over by k spots.
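
The coefficients described in points (1) and (2) can be computed exactly. The sketch below assumes that the coefficient on $x_{k+j}$ is $(-1)^j f_B(j;\ p/(1+p),\ r)$, where $f_B$ is the mass function of the difference of two independent Binomial(r, p/(1+p)) variables; the function names are hypothetical. For r = 1 it recovers, up to a positive scale factor, the three-term inequality $-x_{k-1} + (p + 1/p)\, x_k - x_{k+1} \geq 0$ of the geometric mechanism.

```python
from fractions import Fraction as F
from math import comb

def binom_pmf(a, q, r):
    """P[A = a] for A ~ Binomial(r, q)."""
    return comb(r, a) * q ** a * (1 - q) ** (r - a)

def diff_binom_pmf(j, q, r):
    """P[A - B = j] for independent A, B ~ Binomial(r, q)."""
    return sum(binom_pmf(b + j, q, r) * binom_pmf(b, q, r)
               for b in range(r + 1) if 0 <= b + j <= r)

def dnb_coefficients(p, r):
    """Coefficients (-1)^j f_B(j; p/(1+p), r) for j = -r, ..., r."""
    q = p / (1 + p)
    sign = lambda j: 1 if j % 2 == 0 else -1   # avoids (-1) ** j, which is a float for j < 0
    return {j: sign(j) * diff_binom_pmf(j, q, r) for j in range(-r, r + 1)}

p = F(1, 2)
c1 = dnb_coefficients(p, 1)
c2 = dnb_coefficients(p, 2)
```

Because the coefficients are supported on $j = -r, \dots, r$, each inequality involves only the $2r + 1$ inputs closest to k.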

One interpretation of Theorem 5.15, therefore, is that if an attacker has managed to rule out all possible
inputs except $k - r, k - r + 1, \dots, k + r - 1, k + r$ and has a prior on these inputs that corresponds to
the difference of two binomials (centered at k) then after seeing the sanitized output of $\mathfrak{M}_{DNB(p,r)}$, the
attacker will believe that the set of possible inputs $\{\dots, k - 3, k - 1, k + 1, \dots\}$ is not more likely than
$\{\dots, k - 4, k - 2, k, k + 2, \dots\}$. Again we see a notion of protection of parity but for a smaller set of possible
inputs, and note that initially this looks like a one-sided guarantee - the posterior probability of odd offsets
from k does not increase beyond the posterior probability of the even offsets from k.

However, what is surprising to us is that this kind of guarantee has many strong implications. To illustrate
this point, consider $\mathfrak{M}_{DNB(p,1)}$ which is equivalent to the geometric mechanism. The linear inequalities in
Theorem 5.15 then simplify (after some simple manipulations) to $-x_{k-1} + (p + 1/p)\, x_k - x_{k+1} \geq 0$, which means
that a mechanism $\mathfrak{M}$ must satisfy, for all k, $-P[\mathfrak{M}(k-1) = \omega] + (p + 1/p) P[\mathfrak{M}(k) = \omega] - P[\mathfrak{M}(k+1) = \omega] \geq 0$.
Using these inequalities in the following telescoping sums, we see that they imply the familiar $\epsilon$-differential
privacy constraints with $\epsilon = -\log p$ (so $e^{\epsilon} = 1/p$).

$$p^{-1} P[\mathfrak{M}(k) = \omega] - P[\mathfrak{M}(k-1) = \omega] = \sum_{j=0}^{\infty} p^j \Big( -P[\mathfrak{M}(k-1+j) = \omega] + (p + 1/p) P[\mathfrak{M}(k+j) = \omega] - P[\mathfrak{M}(k+1+j) = \omega] \Big) \geq 0$$

$$p^{-1} P[\mathfrak{M}(k) = \omega] - P[\mathfrak{M}(k+1) = \omega] = \sum_{j=0}^{\infty} p^j \Big( -P[\mathfrak{M}(k-1-j) = \omega] + (p + 1/p) P[\mathfrak{M}(k-j) = \omega] - P[\mathfrak{M}(k+1-j) = \omega] \Big) \geq 0$$

The take-home message, we believe, from this example is that protections on parity, even one-sided 
protections can be very powerful (for example, we saw how the one-sided protections in Theorem 5.15 can 
imply the two-sided protections in differential privacy). Thus an interesting direction for future work is to 
develop methods for analyzing how different guarantees relate to each other; for example, if we protect a 
fact X, then what else do we end up protecting? 
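
The r = 1 case can be checked directly on the geometric mechanism itself. The sketch below is an illustration with hypothetical names and an arbitrary window of inputs ($-10, \dots, 10$): it builds one row $x_i = P[\mathfrak{M}(i) = \omega]$ of the geometric mechanism, evaluates the left-hand side of $-x_{k-1} + (p + 1/p)\, x_k - x_{k+1} \geq 0$, and confirms the neighbouring-input ratio bound $x_{k \pm 1} \leq x_k / p$ of differential privacy with $e^{\epsilon} = 1/p$.

```python
from fractions import Fraction as F

def geometric_row(omega, p, lo, hi):
    """x_i = P[M(i) = omega] for the geometric mechanism, for inputs i in [lo, hi]."""
    c = (1 - p) / (1 + p)
    return {i: c * p ** abs(omega - i) for i in range(lo, hi + 1)}

def parity_constraint(x, k, p):
    """Left-hand side of -x_{k-1} + (p + 1/p) x_k - x_{k+1} >= 0."""
    return -x[k - 1] + (p + 1 / p) * x[k] - x[k + 1]

p = F(1, 2)
x = geometric_row(0, p, -10, 10)
lhs = {k: parity_constraint(x, k, p) for k in range(-9, 10)}
dp_ok = all(x[k] / p >= x[k - 1] and x[k] / p >= x[k + 1] for k in range(-9, 10))
```

For this row the constraint is tight (equal to 0) everywhere except at k = $\omega$, which reflects the fact that the constraint vectors are the columns of the inverse of the mechanism's matrix.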

5.3.2 Skellam Noise 

In the previous section, we saw how (differenced) negative binomial noise was related to protections against
attackers with (differenced) binomial priors, thus exhibiting a dual relationship between the binomial and
negative binomial distributions. In this section, we study noise distributed according to the Skellam
distribution [38], which turns out to be its own dual.

The Poisson($\lambda$) distribution is a probability distribution over nonnegative integers k with distribution
$e^{-\lambda} \lambda^k / k!$. A random variable Z has the Skellam($\lambda_1, \lambda_2$) distribution if it is equal to the difference $X - Y$
of two independent random variables X and Y having the Poisson($\lambda_1$) and Poisson($\lambda_2$) distributions,
respectively [38].

Theorem 5.16. Let the input domain $\mathbb{I} = \{\dots, -2, -1, 0, 1, 2, \dots\}$ be the set of integers. Let $\mathfrak{M}_{skell(\lambda_1, \lambda_2)}$
be the algorithm that adds to its input a random integer k with the Skellam($\lambda_1, \lambda_2$) distribution and let
$f_Z(\cdot\,; \lambda_1, \lambda_2)$ be the probability mass function of the Skellam($\lambda_1, \lambda_2$) distribution. A bounded row vector
$x = (\dots, x_{-2}, x_{-1}, x_0, x_1, x_2, \dots)$ belongs to $\mathrm{rowcone}(\{\mathfrak{M}_{skell(\lambda_1, \lambda_2)}\})$ if for all integers k,

$$\sum_{j=-\infty}^{\infty} (-1)^j f_Z(j; \lambda_1, \lambda_2)\, x_{k+j} \geq 0$$

Proof. See Appendix I. □ 

As before, we see that Skellam noise protects parity if the attacker uses a Skellam prior that is shifted 9
by k so that the posterior probability of the set $\{\dots, k - 3, k - 1, k + 1, k + 3, \dots\}$ cannot be higher than
that of the set $\{\dots, k - 2, k, k + 2, \dots\}$.

5.3.3 Other distributions. 

When the input domain is the set of integers there is a general technique for deriving the row cone corresponding to an algorithm that adds integer-valued noise to its inputs. If the noise distribution has probability mass function $f$, then the matrix representation of the noise-addition algorithm is a matrix $M$ (with rows and columns indexed by integers) whose $(i,j)$ entry is $f(i - j)$. One can take the Fourier series transform (characteristic function) $\hat{f}(t) = \sum_{\ell=-\infty}^{\infty} f(\ell)e^{i\ell t}$. Let $g$ be the inverse transform of $1/\hat{f}(t)$, if it exists. Then the inverse of the matrix $M$ is a matrix whose $(i,j)$ entries are $g(i - j)$. In combination with Theorem 5.1, this allows one to derive the linear constraints defining the row cone. We used this approach to derive the results of Sections 5.3.1 and 5.3.2, and the proof of Theorem 5.15 provides a formal justification for this technique.
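
As a hedged illustration of this technique, the sketch below applies it to the geometric noise $f(k) = \frac{1-p}{1+p}p^{|k|}$ that appears in Lemma 5.14. The closed form for $g$ is our own calculation (from $\hat{f}(t) = (1-p)^2/(1 - 2p\cos t + p^2)$) and is not quoted from the paper; the function names are illustrative. The code checks numerically that $f * g$ is the unit impulse, which is what it means for the two matrices to be inverses.

```python
def geometric_pmf(k, p):
    """Two-sided geometric noise: f(k) = (1-p)/(1+p) * p^|k|."""
    return (1 - p) / (1 + p) * p ** abs(k)

def inverse_kernel(k, p):
    """Inverse transform g of 1/f_hat (assumed closed form, derived by hand).

    1/f_hat(t) = ((1 + p^2) - p e^{it} - p e^{-it}) / (1-p)^2,
    so g is supported on {-1, 0, 1}.
    """
    if k == 0:
        return (1 + p * p) / (1 - p) ** 2
    if abs(k) == 1:
        return -p / (1 - p) ** 2
    return 0.0

def convolve_at(k, p):
    """(f * g)(k); g has three nonzero entries, so the sum is finite."""
    return sum(geometric_pmf(k - j, p) * inverse_kernel(j, p) for j in (-1, 0, 1))

p = 0.3
values = {k: convolve_at(k, p) for k in range(-5, 6)}
print(values)
```

Under these assumptions, the resulting row cone constraints read $(1+p^2)x_k - p\,x_{k-1} - p\,x_{k+1} \geq 0$ for every integer $k$.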

5.4 Relaxing Privacy Definitions 

As we saw in Section 5.1, a privacy definition $\mathfrak{Priv}$ may end up protecting more than we want. In such cases, we can manipulate rowcone($\mathfrak{Priv}$) to relax it. This will give us a new row cone $R$ and will allow us to create a privacy definition $\mathfrak{Priv}'$ of the form: $\mathfrak{M} \in \mathfrak{Priv}'$ if and only if every row of the matrix representation of $\mathfrak{M}$ belongs to $R$.

To relax rowcone($\mathfrak{Priv}$), we will replace the linear constraints that define it with weaker linear constraints. An appropriate tool is Fourier-Motzkin elimination [8], which will produce a new set of linear constraints which are implied by the old constraints. The new constraints will have fewer variables per constraint.

We illustrate this technique by continuing Example 5.6 (randomized response on databases with $k = 2$ tuples). Rewriting equations 4 and 7 to isolate $x_{01}$ and setting $a = p/(1-p)$, we get

\[ a x_{00} + x_{11}/a - x_{10} \;\geq\; x_{01} \;\geq\; a x_{00} + a x_{11} - a^2 x_{10} \]
\[ \Rightarrow \quad x_{11} \leq a x_{10} \]

Recalling that $x_{11}$ is shorthand for $P(\mathfrak{M}(11) = \omega)$ and $x_{10}$ is shorthand for $P(\mathfrak{M}(10) = \omega)$, we see that Fourier-Motzkin elimination on the original constraints yielded one of the constraints of $(\ln\frac{p}{1-p})$-differential privacy. Applying Fourier-Motzkin elimination on the other equations in Example 5.6 yields the rest of the differential privacy constraints. Thus we see that differential privacy is a natural relaxation of randomized response.
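
The relaxation can be spot-checked numerically. The following is a minimal sketch, not part of the original analysis: it draws random nonnegative combinations of the rows of the randomized response matrix for $k = 2$ (such combinations lie in the row cone) and checks that each one satisfies the differential privacy constraints $x_D \leq a\,x_{D'}$ for inputs $D, D'$ differing in one bit. The helper names and the choice $p = 0.75$ are illustrative assumptions.

```python
import itertools
import random

def rr_row(output, p):
    """Row of the randomized response matrix for one output: the entry for
    input d is the product over bits of B[o][d], with B = [[p, 1-p], [1-p, p]]."""
    b = [[p, 1 - p], [1 - p, p]]
    row = {}
    for d in itertools.product((0, 1), repeat=len(output)):
        prob = 1.0
        for o_bit, d_bit in zip(output, d):
            prob *= b[o_bit][d_bit]
        row[d] = prob
    return row

def random_cone_vector(p, rng, k=2):
    """Random nonnegative combination of rows, which lies in the row cone."""
    datasets = list(itertools.product((0, 1), repeat=k))
    x = {d: 0.0 for d in datasets}
    for output in datasets:
        weight = rng.random()
        for d, value in rr_row(output, p).items():
            x[d] += weight * value
    return x

def satisfies_dp(x, p, tol=1e-12):
    """Check x[d1] <= a * x[d2] for every pair of inputs differing in one bit."""
    a = p / (1 - p)
    for d1 in x:
        for d2 in x:
            differing = sum(u != v for u, v in zip(d1, d2))
            if differing == 1 and x[d1] > a * x[d2] + tol:
                return False
    return True

rng = random.Random(0)
all_ok = all(satisfies_dp(random_cone_vector(0.75, rng), 0.75) for _ in range(1000))
print(all_ok)
```

The converse does not hold: a vector can satisfy the differential privacy constraints without lying in the row cone of randomized response, which is why this is a relaxation.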

9 i.e. the prior has the distribution of Z + k where k is a constant and Z has the Skellam distribution. 
6 Conclusions



We view privacy as a type of theory of information where the goal is to study how different algorithms filter out certain pieces of information. To this end we proposed the first (to the best of our knowledge) framework for extracting semantic guarantees from privacy definitions. The framework depends on the concepts of consistent normal form CNF($\mathfrak{Priv}$) and rowcone($\mathfrak{Priv}$). The consistent normal form corresponds to an explicit set of trusted algorithms and the row cone corresponds to the type of information that is always protected by an output of an algorithm belonging to a given privacy definition. The usefulness of these concepts comes from their geometric nature and relations to linear algebra and convex geometry.

There are many important directions for future work. These include extracting semantic guarantees that 
fail with a small probability such as various probabilistic relaxations of differential privacy (e.g., [33, 5]). In 
contrast, the row cone is only useful for finding guarantees that always hold. It is also important to study 
formal ways of relaxing/strengthening privacy definitions and exploring the relationships between different 
types of semantic guarantees. 
A Proof of Theorem 4.4



Theorem A.1. (Restatement and proof of Theorem 4.4). Given a privacy definition $\mathfrak{Priv}$, its consistent normal form CNF($\mathfrak{Priv}$) is equivalent to the following.

1. Define $\mathfrak{Priv}^{(1)}$ to be the set of all (deterministic and randomized) algorithms of the form $\mathcal{A} \circ \mathfrak{M}$ where $\mathfrak{M} \in \mathfrak{Priv}$, range($\mathfrak{M}$) $\subseteq$ domain($\mathcal{A}$), and the random bits of $\mathcal{A}$ and $\mathfrak{M}$ are independent of each other.

2. For any positive integer $n$, finite sequence $\mathfrak{M}_1, \ldots, \mathfrak{M}_n$ and probability vector $p = (p_1, \ldots, p_n)$, use the notation $\mathrm{choice}^{p}(\mathfrak{M}_1, \ldots, \mathfrak{M}_n)$ to represent the algorithm that runs $\mathfrak{M}_i$ with probability $p_i$. Define $\mathfrak{Priv}^{(2)}$ to be the set of all algorithms of the form $\mathrm{choice}^{p}(\mathfrak{M}_1, \ldots, \mathfrak{M}_n)$ where $n$ is a positive integer, $\mathfrak{M}_1, \ldots, \mathfrak{M}_n \in \mathfrak{Priv}^{(1)}$, and $p$ is a probability vector.

3. Set CNF($\mathfrak{Priv}$) $= \mathfrak{Priv}^{(2)}$.

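Proof. We need to show that $\mathfrak{Priv}^{(2)}$ satisfies Axioms 4.1 and 4.2 and that any other privacy definition that satisfies both axioms and contains $\mathfrak{Priv}$ must also contain $\mathfrak{Priv}^{(2)}$.

By construction, $\mathfrak{Priv}^{(2)}$ satisfies Axiom 4.2 (convexity). To show that $\mathfrak{Priv}^{(2)}$ satisfies Axiom 4.1 (post-processing), choose any $\mathfrak{M} \in \mathfrak{Priv}^{(2)}$ and a postprocessing algorithm $\mathcal{A}$. By construction of $\mathfrak{Priv}^{(2)}$, there exists an integer $m$, a sequence of algorithms $\mathfrak{M}^{(1)}_1, \ldots, \mathfrak{M}^{(1)}_m$ with each $\mathfrak{M}^{(1)}_i \in \mathfrak{Priv}^{(1)}$, and a probability vector $p = (p_1, \ldots, p_m)$ such that $\mathfrak{M} = \mathrm{choice}^{p}(\mathfrak{M}^{(1)}_1, \ldots, \mathfrak{M}^{(1)}_m)$. It is easy to check that $\mathcal{A} \circ \mathfrak{M} = \mathrm{choice}^{p}(\mathcal{A} \circ \mathfrak{M}^{(1)}_1, \ldots, \mathcal{A} \circ \mathfrak{M}^{(1)}_m)$. By construction of $\mathfrak{Priv}^{(1)}$, $\mathcal{A} \circ \mathfrak{M}^{(1)}_i \in \mathfrak{Priv}^{(1)}$ because $\mathfrak{M}^{(1)}_i \in \mathfrak{Priv}^{(1)}$. Therefore, by construction of $\mathfrak{Priv}^{(2)}$, $\mathcal{A} \circ \mathfrak{M} \in \mathfrak{Priv}^{(2)}$ and so $\mathfrak{Priv}^{(2)}$ satisfies Axiom 4.1 (post-processing).

Now let $\mathfrak{Priv}'$ be some privacy definition containing $\mathfrak{Priv}$ and satisfying both axioms. By Axiom 4.1 (post-processing), $\mathfrak{Priv}^{(1)} \subseteq \mathfrak{Priv}'$. By Axiom 4.2 (convexity) it follows that $\mathfrak{Priv}^{(2)} \subseteq \mathfrak{Priv}'$. Therefore CNF($\mathfrak{Priv}$) $= \mathfrak{Priv}^{(2)} \subseteq \mathfrak{Priv}'$. □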
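
The identity $\mathcal{A} \circ \mathrm{choice}^{p}(\ldots) = \mathrm{choice}^{p}(\mathcal{A} \circ \ldots)$ used in the proof can be seen concretely on matrix representations. The sketch below is illustrative only. It assumes finite input and output spaces, column-stochastic matrices (rows indexed by outputs, columns by inputs), post-processing realized as a matrix product, and component algorithms that share the same output labels. The specific matrices are made-up examples.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def choice(probs, mats):
    """Matrix of the algorithm that runs mats[i] with probability probs[i]
    (all matrices are assumed to use the same output labels)."""
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(w * m[i][j] for w, m in zip(probs, mats))
             for j in range(cols)] for i in range(rows)]

def column_sums(m):
    """Sum of each column; all ones for a matrix representation."""
    return [sum(row[j] for row in m) for j in range(len(m[0]))]

m1 = [[0.75, 0.25], [0.25, 0.75]]   # hypothetical algorithm 1
m2 = [[0.5, 0.5], [0.5, 0.5]]       # hypothetical algorithm 2
a = [[1.0, 0.5], [0.0, 0.5]]        # hypothetical post-processing step
probs = [0.25, 0.75]

lhs = matmul(a, choice(probs, [m1, m2]))                 # A o choice(M1, M2)
rhs = choice(probs, [matmul(a, m1), matmul(a, m2)])      # choice(A o M1, A o M2)
print(lhs, rhs, column_sums(lhs))
```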

B Proof of Corollary 4.5 

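Corollary B.1. (Restatement of Corollary 4.5).

If $\mathfrak{Priv} = \{\mathfrak{M}\}$ consists of just one algorithm, CNF($\mathfrak{Priv}$) is the set of all algorithms of the form $\mathcal{A} \circ \mathfrak{M}$, where range($\mathfrak{M}$) $\subseteq$ domain($\mathcal{A}$) and the random bits in $\mathcal{A}$ and $\mathfrak{M}$ are independent of each other.

Proof. We use the notation defined in Theorem 4.4. The corollary follows easily from the process described in Theorem 4.4 and the fact that

\[ \mathrm{choice}^{p}(\mathcal{A}_1 \circ \mathfrak{M}, \ldots, \mathcal{A}_n \circ \mathfrak{M}) = \left(\mathrm{choice}^{p}(\mathcal{A}_1, \ldots, \mathcal{A}_n)\right) \circ \mathfrak{M} \]

so that the process of computing CNF($\mathfrak{Priv}$) stops after the first step. □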

C Proof of Theorem 4.7 

Theorem C.1. (Restatement and proof of Theorem 4.7). rowcone($\mathfrak{Priv}$) is a convex cone.

Proof. Choose any $v = (v_1, v_2, \ldots) \in$ rowcone($\mathfrak{Priv}$). Then by definition $cv \in$ rowcone($\mathfrak{Priv}$) for any $c \geq 0$. This takes care of the cone property so that we only need to show that rowcone($\mathfrak{Priv}$) is a convex set.

Choose any vectors $x = (x_1, x_2, \ldots) \in$ rowcone($\mathfrak{Priv}$), $y = (y_1, y_2, \ldots) \in$ rowcone($\mathfrak{Priv}$), and number $t$ such that $0 \leq t \leq 1$. We show that $tx + (1 - t)y \in$ rowcone($\mathfrak{Priv}$). If either $x = 0$ or $y = 0$ then we are done by the cone property we just proved. Otherwise, by definition of row cone, there exist constants $c_1, c_2 > 0$, algorithms $\mathfrak{M}_1, \mathfrak{M}_2 \in$ CNF($\mathfrak{Priv}$), and sanitized outputs $\omega_1 \in$ range($\mathfrak{M}_1$), $\omega_2 \in$ range($\mathfrak{M}_2$) such that $x/c_1$ is a row of the matrix representation of $\mathfrak{M}_1$ and $y/c_2$ is a row of the matrix representation of $\mathfrak{M}_2$:

\[ x = (c_1 P[\mathfrak{M}_1(D_1) = \omega_1],\; c_1 P[\mathfrak{M}_1(D_2) = \omega_1],\; \ldots) \]
\[ y = (c_2 P[\mathfrak{M}_2(D_1) = \omega_2],\; c_2 P[\mathfrak{M}_2(D_2) = \omega_2],\; \ldots) \]

Let $\mathcal{A}_1$ be the algorithm that outputs $\omega$ if its input is $\omega_1$ and $\omega'$ otherwise. Similarly, let $\mathcal{A}_2$ be the algorithm that outputs $\omega$ if its input is $\omega_2$ and $\omega'$ otherwise. Define $\mathfrak{M}'_1 = \mathcal{A}_1 \circ \mathfrak{M}_1$ and $\mathfrak{M}'_2 = \mathcal{A}_2 \circ \mathfrak{M}_2$. Then by Theorem 4.4 (and the post-processing Axiom 4.1), $\mathfrak{M}'_1, \mathfrak{M}'_2 \in$ CNF($\mathfrak{Priv}$) and



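\[ x = (c_1 P[\mathfrak{M}'_1(D_1) = \omega],\; c_1 P[\mathfrak{M}'_1(D_2) = \omega],\; \ldots) \]
\[ y = (c_2 P[\mathfrak{M}'_2(D_1) = \omega],\; c_2 P[\mathfrak{M}'_2(D_2) = \omega],\; \ldots) \]

Now consider the algorithm $\mathfrak{M}^*$ which runs $\mathfrak{M}'_1$ with probability $\frac{t c_1}{t c_1 + (1-t) c_2}$ and runs $\mathfrak{M}'_2$ with probability $\frac{(1-t) c_2}{t c_1 + (1-t) c_2}$. By Theorem 4.4, $\mathfrak{M}^* \in$ CNF($\mathfrak{Priv}$). Then for all $i = 1, 2, \ldots$,

\[ P(\mathfrak{M}^*(D_i) = \omega) = \frac{t c_1 P(\mathfrak{M}'_1(D_i) = \omega) + (1-t) c_2 P(\mathfrak{M}'_2(D_i) = \omega)}{t c_1 + (1-t) c_2} = \frac{t x_i + (1-t) y_i}{t c_1 + (1-t) c_2} \]

Thus the vector $\frac{tx + (1-t)y}{t c_1 + (1-t) c_2}$ is the row vector corresponding to $\omega$ of the matrix representation of $\mathfrak{M}^*$ and is therefore in rowcone($\mathfrak{Priv}$). Multiplying by the nonnegative constant $t c_1 + (1-t) c_2$, we get that $tx + (1-t)y \in$ rowcone($\mathfrak{Priv}$) and so rowcone($\mathfrak{Priv}$) is convex. □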



D Proof of Theorem 5.1 

Theorem D.1. (Restatement and proof of Theorem 5.1). Let $I$ be a finite or countably infinite set of possible datasets. Let $\mathfrak{M}^*$ be an algorithm with domain($\mathfrak{M}^*$) $= I$. Let $M^*$ be the matrix representation of $\mathfrak{M}^*$ (Definition 2.1). If $(M^*)^{-1}$ exists and the $L_1$ norm of each column of $(M^*)^{-1}$ is bounded by a constant $C$ then

(1) A bounded row vector $x \in$ rowcone($\{\mathfrak{M}^*\}$) if and only if $x \cdot m \geq 0$ for every column $m$ of $(M^*)^{-1}$.

(2) An algorithm $\mathfrak{M}$, with matrix representation $M$, belongs to CNF($\{\mathfrak{M}^*\}$) if and only if the matrix $M(M^*)^{-1}$ contains no negative entries.

(3) An algorithm $\mathfrak{M}$, with matrix representation $M$, belongs to CNF($\{\mathfrak{M}^*\}$) if and only if every row of $M$ belongs to rowcone($\{\mathfrak{M}^*\}$).

Proof. We first prove (1). If $x$ is the $0$ vector then this is clearly true. Thus assume $x \neq 0$. If $x \in$ rowcone($\{\mathfrak{M}^*\}$) then by definition of the row cone and by Corollary 4.5, $x = yM^*$ where $y$ is a bounded row vector and has nonnegative components. Then $x(M^*)^{-1} = yM^*(M^*)^{-1} = y$ and so $x \cdot m \geq 0$ for every column $m$ of $(M^*)^{-1}$.

For the other direction, we must construct an algorithm $\mathcal{A}$ with matrix representation $A$ such that for some $c > 0$, $cx$ is a row of $AM^*$ (by definition of row cone and Corollary 4.5). Thus, by hypothesis, suppose $x \cdot m \geq 0$ for each column vector $m$ of $(M^*)^{-1}$ and consider the row vector $y = x(M^*)^{-1}$ which therefore has nonnegative entries. Since $x$ is bounded and $\|m\|_1 \leq C$ for each column vector $m$ of $(M^*)^{-1}$, we have $|x \cdot m| \leq \|x\|_\infty \|m\|_1 \leq \|x\|_\infty C$ (by Hölder's Inequality [35]) so that $y$ is bounded. Choose a $c > 0$ so that $cy$ is bounded by 1. Consider the algorithm $\mathcal{A}$ that has a matrix representation $A$ with two rows, the first row being $cy$ and the second row being $1 - cy$ ($\mathcal{A}$ is an algorithm since $cy$ and $1 - cy$ have nonnegative components and the column sums of $A$ are clearly 1). $\mathcal{A}$ is the desired algorithm since $cx$ is a row of $AM^*$.

To prove (2) and (3), note that if an algorithm has matrix representation $M$, then $M(M^*)^{-1}$ contains all the dot products between rows of $M$ and columns of $(M^*)^{-1}$. Therefore, the entries of $M(M^*)^{-1}$ are nonnegative if and only if every row of $M$ is in rowcone($\{\mathfrak{M}^*\}$) (this follows directly from the first part of the theorem). Thus (2) and (3) are equivalent and therefore we only need to prove (2).

To prove (2), first note the trivial direction. If $\mathfrak{M} \in$ CNF($\{\mathfrak{M}^*\}$) then by definition every row of $M$ is in the row cone (and so by (1) all entries of $M(M^*)^{-1}$ are nonnegative). For the other direction, let $A = M(M^*)^{-1}$ (which has no negative entries by hypothesis). If we can show that the column sums of $A$ are all 1 then, since $A$ contains no negative entries, $A$ would be a column stochastic matrix and therefore it would be the matrix representation of some algorithm $\mathcal{A}$. From this it would follow that $AM^* = M$ and therefore $\mathcal{A} \circ \mathfrak{M}^* = \mathfrak{M}$ (in which case $\mathfrak{M} \in$ CNF($\{\mathfrak{M}^*\}$) by Theorem 4.4).

So all we need to do is to prove that the column sums of $A$ are all 1. Let $\mathbf{1}$ be a column vector whose components are all 1. Then since $M$ is a matrix representation of an algorithm (Definition 2.1), $M$ has column sums equal to 1, and similarly for $M^*$. Thus:

\[ \mathbf{1}^T = \mathbf{1}^T M^* (M^*)^{-1} = \mathbf{1}^T (M^*)^{-1} \]

and therefore

\[ \mathbf{1}^T A = \mathbf{1}^T M (M^*)^{-1} = \mathbf{1}^T (M^*)^{-1} = \mathbf{1}^T \]

and so the column sums of $A$ are equal to 1. This completes the proof of this theorem.

□ 
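
Parts (1) and (3) of the theorem translate directly into a membership test. The following minimal sketch assumes a finite input space (so the $L_1$ bound holds trivially) and uses single-bit randomized response with $p = 0.75$ as a stand-in for $\mathfrak{M}^*$; the function names and test vectors are illustrative assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def in_row_cone(x, m_star_inv, tol=1e-12):
    """Part (1): x . m >= 0 for every column m of the inverse of M*."""
    columns = list(zip(*m_star_inv))
    return all(dot(x, col) >= -tol for col in columns)

def in_cnf(m, m_star_inv, tol=1e-12):
    """Part (3): every row of the matrix representation M is in the row cone."""
    return all(in_row_cone(row, m_star_inv, tol) for row in m)

p = 0.75
# Inverse of [[p, 1-p], [1-p, p]], written in closed form.
m_star_inv = [[p / (2 * p - 1), -(1 - p) / (2 * p - 1)],
              [-(1 - p) / (2 * p - 1), p / (2 * p - 1)]]

print(in_row_cone((0.75, 0.25), m_star_inv))             # a row of M* itself
print(in_row_cone((0.9, 0.1), m_star_inv))               # ratio 9 exceeds p/(1-p) = 3
print(in_cnf([[0.6, 0.4], [0.4, 0.6]], m_star_inv))      # noisier than M*
```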

E Proof of Lemma 5.4 

Lemma E.1. (Restatement and proof of Lemma 5.4). Given a privacy parameter $p$, define $q = \max(p, 1 - p)$. Then

• CNF($\{\mathfrak{M}_{rr(p)}\}$) = CNF($\{\mathfrak{M}_{rr(q)}\}$).

• If $p = 1/2$ then CNF($\{\mathfrak{M}_{rr(p)}\}$) consists of the set of algorithms whose outputs are statistically independent of their inputs (i.e. those algorithms $\mathfrak{M}$ where $P[\mathfrak{M}(D_i) = \omega] = P[\mathfrak{M}(D_j) = \omega]$ for all $D_i, D_j \in I$ and $\omega \in$ range($\mathfrak{M}$)), and therefore attackers learn nothing from those outputs.

Proof. Consider the algorithm $\mathfrak{M}_{rr(0)}$ which always flips each bit in its input. It is easy to see that $\mathfrak{M}_{rr(0)} \circ \mathfrak{M}_{rr(p)} = \mathfrak{M}_{rr(1-p)}$ and $\mathfrak{M}_{rr(0)} \circ \mathfrak{M}_{rr(1-p)} = \mathfrak{M}_{rr(p)}$. From Theorem 4.4, it follows that CNF($\{\mathfrak{M}_{rr(p)}\}$) = CNF($\{\mathfrak{M}_{rr(1-p)}\}$) and therefore CNF($\{\mathfrak{M}_{rr(p)}\}$) = CNF($\{\mathfrak{M}_{rr(q)}\}$).

Clearly, the output of $\mathfrak{M}_{rr(1/2)}$ is independent of whatever was the true input table $D \in I$. By Theorem 4.4, all algorithms in CNF($\{\mathfrak{M}_{rr(1/2)}\}$) have outputs independent of their inputs. For the other direction, choose any algorithm $\mathfrak{M}$ whose outputs are statistically independent of their inputs. Then it is easy to see that $\mathfrak{M} = \mathfrak{M} \circ \mathfrak{M}_{rr(1/2)}$; that is, $\mathfrak{M}$ and $\mathfrak{M} \circ \mathfrak{M}_{rr(1/2)}$ have the same range and $P(\mathfrak{M}(D_i) = \omega) = P([\mathfrak{M} \circ \mathfrak{M}_{rr(1/2)}](D_i) = \omega)$ for all $D_i \in I$ and $\omega \in$ range($\mathfrak{M}$). Thus $\mathfrak{M} \in$ CNF($\{\mathfrak{M}_{rr(1/2)}\}$).

Clearly, when the output is statistically independent of the input, an attacker can learn nothing about 
the input after observing the output. □ 

F Proof of Theorem 5.5 

Theorem F.1. (Restatement and proof of Theorem 5.5). Given input space $I = \{D_1, \ldots, D_{2^k}\}$ of bit strings of length $k$ and a privacy parameter $p > 1/2$,

• A vector $x = (x_1, \ldots, x_{2^k}) \in$ rowcone($\{\mathfrak{M}_{rr(p)}\}$) if and only if for every bit string $s$ of length $k$,

\[ \sum_{i=1}^{2^k} p^{\mathrm{ham}(s,D_i)} (p-1)^{k-\mathrm{ham}(s,D_i)}\, x_i \geq 0 \]

where $\mathrm{ham}(s, D_i)$ is the Hamming distance between $s$ and $D_i$.

• An algorithm $\mathfrak{M}$ with matrix representation $M$ belongs to CNF($\{\mathfrak{M}_{rr(p)}\}$) if and only if every row of $M$ belongs to rowcone($\{\mathfrak{M}_{rr(p)}\}$).

Proof. Our strategy is to first derive the matrix representation of $\mathfrak{M}_{rr(p)}$, which we denote by $M_{rr(p)}$. Then we find the inverse of $M_{rr(p)}$ and apply Theorem 5.1. Accordingly, we break the proof down into 3 steps.

Step 1: Derive $M_{rr(p)}$. Define $B$ to be the matrix

\[ B = \begin{pmatrix} p & 1-p \\ 1-p & p \end{pmatrix} \]

Recall that the Kronecker product $C \otimes D$ of an $m \times n$ matrix $C$ and $m' \times n'$ matrix $D$ is the block matrix

\[ \begin{pmatrix} c_{11}D & \cdots & c_{1n}D \\ \vdots & & \vdots \\ c_{m1}D & \cdots & c_{mn}D \end{pmatrix} \]

of dimension $mm' \times nn'$. An easy induction shows that the matrix representation $M_{rr(p)}$ is equal to the $k$-fold Kronecker product of $B$ with itself:

\[ M_{rr(p)} = \bigotimes_{i=1}^{k} B \]

The entry in row $i$ and column $j$ of $M_{rr(p)}$ is equal to $P[\mathfrak{M}_{rr(p)}(D_j) = D_i]$ and a direct computation shows that this is equal to

\[ p^{\mathrm{ham}(D_i,D_j)} (1-p)^{k-\mathrm{ham}(D_i,D_j)} \]
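Step 2: Derive $(M_{rr(p)})^{-1}$. It is easy to check that

\[ B^{-1} = \frac{1}{2p-1} \begin{pmatrix} p & -(1-p) \\ -(1-p) & p \end{pmatrix} \]

and therefore

\[ (M_{rr(p)})^{-1} = \bigotimes_{i=1}^{k} B^{-1} = \frac{1}{(2p-1)^k} \bigotimes_{i=1}^{k} \begin{pmatrix} p & -(1-p) \\ -(1-p) & p \end{pmatrix} \]

A comparison with $M_{rr(p)} = \bigotimes_{i=1}^{k} B$ shows that we can calculate the entry in row $i$ and column $j$ of $(M_{rr(p)})^{-1}$ by taking the corresponding entry of $M_{rr(p)}$, replacing every occurrence of $1 - p$ with $-(1 - p) = p - 1$, and dividing by $(2p-1)^k$. Thus the entry in row $i$ and column $j$ of $(M_{rr(p)})^{-1}$ is equal to

\[ \frac{p^{\mathrm{ham}(D_i,D_j)} (p-1)^{k-\mathrm{ham}(D_i,D_j)}}{(2p-1)^k} \]

Therefore each column of $(M_{rr(p)})^{-1}$ has the form:

\[ \frac{1}{(2p-1)^k} \begin{pmatrix} p^{\mathrm{ham}(s,D_1)} (p-1)^{k-\mathrm{ham}(s,D_1)} \\ p^{\mathrm{ham}(s,D_2)} (p-1)^{k-\mathrm{ham}(s,D_2)} \\ \vdots \\ p^{\mathrm{ham}(s,D_{2^k})} (p-1)^{k-\mathrm{ham}(s,D_{2^k})} \end{pmatrix} \]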
Step 3: Now we apply Theorem 5.1 and observe that if $m^{(i)}$ is the $i$th column of $(M_{rr(p)})^{-1}$, then, since $p > 1/2$ and $2p - 1 > 0$, the condition $x \cdot m^{(i)} \geq 0$ is equal to the condition

\[ \sum_{j=1}^{2^k} p^{\mathrm{ham}(s,D_j)} (p-1)^{k-\mathrm{ham}(s,D_j)}\, x_j \geq 0 \]

where $s = D_i$. □
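
The constraints of Theorem 5.5 are easy to evaluate by brute force for small $k$. The sketch below is illustrative and makes two assumptions: a vector $x$ is stored as a dictionary keyed by $k$-bit tuples, and rows of the randomized response matrix are built as Kronecker products of rows of $B$. It checks that every row of $M_{rr(p)}$ passes the test, while a point-mass vector (an output that reveals the input exactly) fails it.

```python
import itertools

def hamming(a, b):
    return sum(u != v for u, v in zip(a, b))

def in_rr_row_cone(x, p, tol=1e-12):
    """Check sum_D p^ham(s,D) (p-1)^(k-ham(s,D)) x_D >= 0 for every bit string s."""
    datasets = list(x)
    k = len(datasets[0])
    for s in itertools.product((0, 1), repeat=k):
        total = sum(p ** hamming(s, d) * (p - 1) ** (k - hamming(s, d)) * x[d]
                    for d in datasets)
        if total < -tol:
            return False
    return True

def rr_row(output, p):
    """Row of M_rr(p) for one output, as a Kronecker product of rows of B."""
    b = [[p, 1 - p], [1 - p, p]]
    row = {}
    for d in itertools.product((0, 1), repeat=len(output)):
        prob = 1.0
        for o_bit, d_bit in zip(output, d):
            prob *= b[o_bit][d_bit]
        row[d] = prob
    return row

k, p = 3, 0.75
datasets = list(itertools.product((0, 1), repeat=k))
rows_ok = all(in_rr_row_cone(rr_row(o, p), p) for o in datasets)

point_mass = {d: 0.0 for d in datasets}
point_mass[(0, 0, 0)] = 1.0
print(rows_ok, in_rr_row_cone(point_mass, p))
```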



G Proof of Theorem 5.7 

Theorem G.1. (Restatement and proof of Theorem 5.7). Let $p$ be a privacy parameter and let $I = \{D_1, \ldots, D_{2^k}\}$. Let $\mathfrak{M}$ be an algorithm that has a matrix representation whose every row belongs to the row cone of randomized response. If the attacker believes that the bits in the data are independent and bit $i$ is equal to 1 with probability $q_i$, then $\mathfrak{M}$ protects the parity of any subset of bits that have prior probability $\geq p$ or $\leq 1 - p$. That is, for any subset $J = \{\ell_1, \ldots, \ell_m\}$ of bits of the input data such that $q_{\ell_j} \geq p \,\vee\, q_{\ell_j} \leq 1 - p$ for $j = 1, \ldots, m$, the following holds:

• If $P(\mathrm{parity}(J) = 0) \geq P(\mathrm{parity}(J) = 1)$ then $P(\mathrm{parity}(J) = 0 \mid \mathfrak{M}(\mathrm{data})) \geq P(\mathrm{parity}(J) = 1 \mid \mathfrak{M}(\mathrm{data}))$

• If $P(\mathrm{parity}(J) = 1) \geq P(\mathrm{parity}(J) = 0)$ then $P(\mathrm{parity}(J) = 1 \mid \mathfrak{M}(\mathrm{data})) \geq P(\mathrm{parity}(J) = 0 \mid \mathfrak{M}(\mathrm{data}))$

Furthermore, an algorithm $\mathfrak{M}$ can only provide these guarantees if every row of its matrix representation belongs to rowcone($\{\mathfrak{M}_{rr(p)}\}$).

Proof. We break this proof up into a series of steps. We first reformulate the statements to make them easier to analyze mathematically, then we specialize to the case where $J = \{1, \ldots, k\}$ is the set of all bits in the database. We then show that every $\mathfrak{M}$ whose rows (in the corresponding matrix representation) belong to rowcone($\{\mathfrak{M}_{rr(p)}\}$) has these semantic guarantees. We then show that only those $\mathfrak{M}$ provide these semantic guarantees. Finally we show that those results imply that the theorem holds for all $J$ whose bits have prior probability $\geq p$ or $\leq 1 - p$.

Step 1: Problem reformulation and specialization to the case when $J = \{1, \ldots, k\}$. Assume $J = \{1, \ldots, k\}$ so that for all bits $j$, either $q_j \geq p$ or $q_j \leq 1 - p$.

First, Lemma 5.4 allows us to assume that the privacy parameter $p > 1/2$ without any loss of generality: the case of $p = 1/2$ is trivial since the output provides no information about the input so that parity is preserved; in the case of $p < 1/2$, the row cone and CNF are unchanged if we replace $p$ with $1 - p$.

Second, we need a few results about parity. An easy induction shows that:

\[ P(\mathrm{parity}(\mathrm{data}) = 1) = \frac{1 - \prod_{j=1}^{k} (1 - 2q_j)}{2} \]

\[ P(\mathrm{parity}(\mathrm{data}) = 0) = \frac{1 + \prod_{j=1}^{k} (1 - 2q_j)}{2} \]

In particular, if all of the $q_j \neq 1/2$ then $P(\mathrm{parity}(\mathrm{data}) = 1) \neq P(\mathrm{parity}(\mathrm{data}) = 0)$ so that one parity has higher prior probability than the other.
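
The closed form for the parity probabilities can be compared against direct enumeration. This is a small sanity-check sketch; the prior vector `q` below is an arbitrary example and the function names are illustrative.

```python
import itertools
import math

def parity_probs_bruteforce(q):
    """(P[parity = 0], P[parity = 1]) for independent bits, where bit j
    equals 1 with probability q[j], by enumerating all bit strings."""
    probs = [0.0, 0.0]
    for bits in itertools.product((0, 1), repeat=len(q)):
        pr = math.prod(qj if b else 1 - qj for qj, b in zip(q, bits))
        probs[sum(bits) % 2] += pr
    return tuple(probs)

def parity_probs_closed_form(q):
    """Closed form from the induction: (1 +/- prod_j (1 - 2 q_j)) / 2."""
    prod = math.prod(1 - 2 * qj for qj in q)
    return ((1 + prod) / 2, (1 - prod) / 2)

q = [0.9, 0.2, 0.6]
print(parity_probs_bruteforce(q), parity_probs_closed_form(q))
```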

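When $J$ is the set of all $k$ bits, $q_j \neq 1/2$ for all $j$ and so the parities cannot be equally likely a priori. The statement about protection of parity can then be rephrased as: $P(\mathrm{parity}(\mathrm{data}) = 0) - P(\mathrm{parity}(\mathrm{data}) = 1)$ and $P(\mathrm{parity}(\mathrm{data}) = 0 \mid \mathfrak{M}(\mathrm{data})) - P(\mathrm{parity}(\mathrm{data}) = 1 \mid \mathfrak{M}(\mathrm{data}))$ have the same sign, or the posterior probabilities of parity are the same. Equivalently,

\[ 0 \leq \Big(P[\mathrm{parity}(\mathrm{data}) = 0] - P[\mathrm{parity}(\mathrm{data}) = 1]\Big) \times \Big(P[\mathrm{parity}(\mathrm{data}) = 0 \mid \mathfrak{M}(\mathrm{data})] - P[\mathrm{parity}(\mathrm{data}) = 1 \mid \mathfrak{M}(\mathrm{data})]\Big) \qquad (10) \]

Now, it is easy to see that

\[ P(\mathrm{parity}(\mathrm{data}) = 0) - P(\mathrm{parity}(\mathrm{data}) = 1) = \left(\bigotimes_{j=1}^{k} (-q_j,\, 1 - q_j)\right) \cdot \left(\bigotimes_{j=1}^{k} (1, 1)\right) = \prod_{j=1}^{k} \big[(-q_j,\, 1 - q_j) \cdot (1, 1)\big] \qquad (11) \]

and

\[ P[\mathrm{parity}(\mathrm{data}) = 0 \mid \mathfrak{M}(\mathrm{data})] - P[\mathrm{parity}(\mathrm{data}) = 1 \mid \mathfrak{M}(\mathrm{data})] = \alpha \left(\bigotimes_{j=1}^{k} (-q_j,\, 1 - q_j)\right) \cdot x \qquad (12) \]

where $\alpha$ is a positive normalizing constant and $x$ is a row vector of the matrix representation of $\mathfrak{M}$. So, by Equations 10, 11, and 12, the statement about protecting parity is equivalent to

\[ \forall x \in \mathrm{rowcone}(\{\mathfrak{M}_{rr(p)}\}): \quad 0 \leq \left(\prod_{j=1}^{k} \big[(-q_j,\, 1 - q_j) \cdot (1, 1)\big]\right) \left(\left(\bigotimes_{j=1}^{k} (-q_j,\, 1 - q_j)\right) \cdot x\right) \qquad (13) \]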

Step 2: Show that if for all $j$, $q_j \geq p \,\vee\, q_j \leq 1 - p$ then the constraints in Equation 13 hold (i.e. the most likely parity a priori is the most likely parity a posteriori).

It follows from Corollary 4.5 that every $\mathfrak{M} \in$ CNF($\{\mathfrak{M}_{rr(p)}\}$) has the form $\mathcal{A} \circ \mathfrak{M}_{rr(p)}$ and so, by Theorem 5.1, $x$ is a row from the matrix representation of an $\mathfrak{M} \in$ CNF($\{\mathfrak{M}_{rr(p)}\}$) if and only if $x \in$ rowcone($\{\mathfrak{M}_{rr(p)}\}$). This means that every such $x$ is a nonnegative linear combination of rows of the randomized response algorithm $\mathfrak{M}_{rr(p)}$. Thus it suffices to show that

\[ 0 \leq \left(\prod_{j=1}^{k} \big[(-q_j,\, 1 - q_j) \cdot (1, 1)\big]\right) \left(\left(\bigotimes_{j=1}^{k} (-q_j,\, 1 - q_j)\right) \cdot m\right) \qquad (14) \]

for each row vector $m$ in $M_{rr(p)}$ (the matrix representation of $\mathfrak{M}_{rr(p)}$). It is easy to check that

\[ M_{rr(p)} = \bigotimes_{i=1}^{k} \begin{pmatrix} p & 1-p \\ 1-p & p \end{pmatrix} \]

and so every vector $m$ that is a row of $M_{rr(p)}$ has the form $\bigotimes_{i=1}^{k} v_i$ where $v_i = (p, 1 - p)$ or $(1 - p, p)$. Thus the right hand side of Equation 14 has the form:

\[ \prod_{j=1}^{k} \Big[\big((-q_j,\, 1 - q_j) \cdot (1, 1)\big) * \big((-q_j,\, 1 - q_j) \cdot v_j\big)\Big] \qquad (15) \]

where $v_j = (p, 1 - p)$ or $(1 - p, p)$. Each term in this product is either

\[ (1 - 2q_j) * [(1 - p)(1 - q_j) - q_j p] = (1 - 2q_j)[1 - p - q_j] \]

or

\[ (1 - 2q_j) * [p(1 - q_j) - q_j(1 - p)] = (1 - 2q_j)[p - q_j] \]

Recalling that we had assumed $p > 1/2$ without any loss of generality, both of these terms are nonnegative if $q_j \geq p > 1/2$ and they are also nonnegative when $q_j \leq (1 - p) < 1/2$. Thus the product in Equation 15 is nonnegative, from which it follows that the conditions in Equations 14 and 13 are satisfied, which implies Equation 10 is satisfied, which proves half of the theorem when restricted to the special case of $J = \{1, \ldots, k\}$.



Step 3: Show that if $\mathfrak{M}$ is a mechanism that protects parity whenever $q_j \geq p \,\vee\, q_j \leq 1 - p$ for $j = 1, \ldots, k$ then every row $x$ in its matrix representation belongs to rowcone($\{\mathfrak{M}_{rr(p)}\}$).

We actually prove a more general statement: if $\mathfrak{M}$ is a mechanism that protects parity whenever $q_j = p \,\vee\, q_j = 1 - p$ for $j = 1, \ldots, k$ then every row $x$ in its matrix representation belongs to rowcone($\{\mathfrak{M}_{rr(p)}\}$).

Recalling the argument leading up to Equation 13 in Step 1 (where we reformulated the problem into a statement that is more amenable to mathematical manipulation), we need to show that if

\[ 0 \leq \left(\prod_{j=1}^{k} \big[(-q_j,\, 1 - q_j) \cdot (1, 1)\big]\right) \left(\left(\bigotimes_{j=1}^{k} (-q_j,\, 1 - q_j)\right) \cdot x\right) \qquad (16) \]

whenever $q_j = p$ or $q_j = 1 - p$ then $x \in$ rowcone($\{\mathfrak{M}_{rr(p)}\}$).

Define the function:

\[ \mathrm{sign}(\alpha) = \begin{cases} -1 & \text{if } \alpha < 0 \\ 0 & \text{if } \alpha = 0 \\ 1 & \text{if } \alpha > 0 \end{cases} \]

Simplifying Equation 16 (by computing the dot product in the first term, looking just at the sign of that dot product, and then combining both terms), our goal is to show that if

\[ 0 \leq \left(\bigotimes_{j=1}^{k} \big[(-q_j,\, 1 - q_j) * \mathrm{sign}(1 - 2q_j)\big]\right) \cdot x \qquad (17) \]

whenever $q_j = p$ or $q_j = 1 - p$ then $x \in$ rowcone($\{\mathfrak{M}_{rr(p)}\}$).

Now, when $q_j = p$ (and recalling that we have assumed $p > 1/2$ with no loss of generality in Step 1), then

\[ (-q_j,\, 1 - q_j) * \mathrm{sign}(1 - 2q_j) = (p,\, -(1 - p)) \]

and when $q_j = 1 - p$ then

\[ (-q_j,\, 1 - q_j) * \mathrm{sign}(1 - 2q_j) = (-(1 - p),\, p) \]

Thus asserting that Equation 17 holds whenever $q_j$ equals $p$ or $1 - p$ is the same as asserting that the vector:

\[ x^T \bigotimes_{j=1}^{k} \frac{1}{2p-1} \begin{pmatrix} p & -(1-p) \\ -(1-p) & p \end{pmatrix} \qquad (18) \]

has no negative components. However, the randomized response algorithm $\mathfrak{M}_{rr(p)}$ has a matrix representation $M_{rr(p)}$ whose inverse (which we also derived in the proof of Theorem 5.5) is

\[ \bigotimes_{j=1}^{k} \frac{1}{2p-1} \begin{pmatrix} p & -(1-p) \\ -(1-p) & p \end{pmatrix} \]

Thus the condition that the vector in Equation 18 has no negative entries means that $x^T (M_{rr(p)})^{-1}$ has no negative entries and so the dot product of $x$ with any column of $(M_{rr(p)})^{-1}$ is nonnegative. By Theorem 5.5, this means that $x \in$ rowcone($\{\mathfrak{M}_{rr(p)}\}$).
This concludes the proof for the entire theorem specialized to the case where $J = \{1, \ldots, k\}$. In the next step, we generalize this to arbitrary $J$.

Step 4: Now let $J = \{\ell_1, \ldots, \ell_m\}$. First consider an "extreme" attacker whose prior beliefs $q_j$ are such that $q_j = 0$ or $q_j = 1$ whenever $j \notin J$. It follows from the previous steps that such an attacker would not change his mind about the parity of the whole dataset. Since the attacker is completely sure about the values of bits outside of $J$, this means that after seeing a sanitized output $\omega$, the attacker will not change his mind about the parity of the bits in $J$.

Now, note that showing

• If $P(\mathrm{parity}(J) = 0) \geq P(\mathrm{parity}(J) = 1)$ then
$P(\mathrm{parity}(J) = 0 \mid \mathfrak{M}(\mathrm{data}) = \omega) \geq P(\mathrm{parity}(J) = 1 \mid \mathfrak{M}(\mathrm{data}) = \omega)$

• If $P(\mathrm{parity}(J) = 1) \geq P(\mathrm{parity}(J) = 0)$ then
$P(\mathrm{parity}(J) = 1 \mid \mathfrak{M}(\mathrm{data}) = \omega) \geq P(\mathrm{parity}(J) = 0 \mid \mathfrak{M}(\mathrm{data}) = \omega)$

is equivalent to showing

• If $P(\mathrm{parity}(J) = 0) \geq P(\mathrm{parity}(J) = 1)$ then
$P(\mathrm{parity}(J) = 0 \wedge \mathfrak{M}(\mathrm{data}) = \omega) \geq P(\mathrm{parity}(J) = 1 \wedge \mathfrak{M}(\mathrm{data}) = \omega)$

• If $P(\mathrm{parity}(J) = 1) \geq P(\mathrm{parity}(J) = 0)$ then
$P(\mathrm{parity}(J) = 1 \wedge \mathfrak{M}(\mathrm{data}) = \omega) \geq P(\mathrm{parity}(J) = 0 \wedge \mathfrak{M}(\mathrm{data}) = \omega)$

since we just multiply both sides of the inequalities by the positive number $P(\mathfrak{M}(\mathrm{data}) = \omega)$.

Now consider an attacker Bob such that $q_j \geq p$ or $q_j \leq 1 - p$ whenever $j \in J$ and there are no restrictions on $q_j$ for $j \notin J$. There is a corresponding set of $2^{k - |J|}$ "extreme" attackers for whom $P(\text{bit } j = 1) = q_j$ for $j \in J$ and $P(\text{bit } j = 1) \in \{0, 1\}$ otherwise.

Bob's vector of prior probabilities over possible datasets

\[ (P[\mathrm{data} = D_1],\; P[\mathrm{data} = D_2],\; \ldots) \]

is a convex combination of the corresponding vectors for the extreme attackers, and thus Bob's joint probabilities

\[ P(\mathrm{parity}(J) = 1 \wedge \mathfrak{M}(\mathrm{data}) = \omega) \]

and

\[ P(\mathrm{parity}(J) = 0 \wedge \mathfrak{M}(\mathrm{data}) = \omega) \]

are convex combinations of the corresponding joint probabilities for the extreme attackers, and the coefficients of this convex combination are the same.

Note that Bob and all of the extreme attackers have the same prior on the parity of $J$. However, we have shown that the extreme attackers will not change their minds about the parity of $J$. Therefore if they believe $P(\mathrm{parity}(J) = 1 \wedge \mathfrak{M}(\mathrm{data}) = \omega)$ is larger than the corresponding probability for even parity, then Bob will have the same belief. If the extreme attackers believe, after seeing the sanitized output $\omega$, that even parity is more likely, then so will Bob. Thus Bob will not change his belief about the parity of the input dataset. □
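
The guarantee can be checked by brute force for small instances. The sketch below is illustrative and not the paper's construction: it takes randomized response itself as the mechanism, enumerates all inputs and outputs, and tests whether the prior and the joint probability gaps for the parity of a bit subset ever disagree in sign. The priors and the subset choices are made-up examples.

```python
import itertools
import math

def parity_protected(q, p, subset, tol=1e-12):
    """True if, for every output of randomized response with parameter p,
    the more likely parity of the bits in `subset` a priori is still at
    least as likely once the output is taken into account."""
    k = len(q)
    b = [[p, 1 - p], [1 - p, p]]  # b[output bit][input bit]
    datasets = list(itertools.product((0, 1), repeat=k))

    def prior(d):
        return math.prod(qj if bit else 1 - qj for qj, bit in zip(q, d))

    def sign_of(d):
        # +1 for even parity on the subset, -1 for odd parity.
        return 1 if sum(d[j] for j in subset) % 2 == 0 else -1

    prior_gap = sum(sign_of(d) * prior(d) for d in datasets)
    for out in datasets:
        joint_gap = sum(
            sign_of(d) * prior(d) * math.prod(b[o][i] for o, i in zip(out, d))
            for d in datasets)
        if prior_gap * joint_gap < -tol:
            return False
    return True

print(parity_protected([0.8, 0.1, 0.95], 0.75, (0, 1, 2)))  # all bits extreme enough
print(parity_protected([0.8, 0.5, 0.95], 0.75, (0, 2)))     # bit 1 is outside J
print(parity_protected([0.6], 0.75, (0,)))                  # prior strictly between 1-p and p
```

In the last call the prior 0.6 lies strictly between $1 - p$ and $p$, so the theorem makes no promise, and for this particular instance an output can indeed reverse the attacker's belief about the bit.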

H Proof of Lemma 5.12 

Lemma H.1. (Restatement and proof of Lemma 5.12). Let $p = \frac{\gamma}{1+\gamma}$. Then $K_p$ is an approximation cone for $\gamma$-FRAPP.

Proof. Clearly $K_p$ is a closed convex cone. Thus we just need to prove that rowcone($\gamma$-FRAPP) $\subseteq K_p$.
Choose any $\mathfrak{M}_Q \in \gamma$-FRAPP, with matrix representation $M_Q$. Clearly

\[ M_Q = \bigotimes_{i=1}^{k} Q \]

and $Q$ satisfies the constraints

\[ \forall i, j \in \{1, \ldots, N\}: \quad Q\,(p\,e_i - (1 - p)\,e_j) \succeq 0 \]

where $e_i$ is the $i$th column vector of the $N \times N$ identity matrix and $a \succeq b$ means that $a - b$ has no negative components. It follows from the properties of the Kronecker product that

\[ \forall i_1, \ldots, i_k, j_1, \ldots, j_k \in \{1, \ldots, N\}: \quad M_Q \left(\bigotimes_{\ell=1}^{k} (p\,e_{i_\ell} - (1 - p)\,e_{j_\ell})\right) \succeq 0 \qquad (19) \]

Thus each row of the matrix representation of $\mathfrak{M}_Q$ satisfies a set of linear constraints.

From Theorem 4.4, we see that CNF($\gamma$-FRAPP) can be obtained by first creating all algorithms of the form $\mathcal{A} \circ \mathfrak{M}_Q$ (for $\mathfrak{M}_Q \in \gamma$-FRAPP) and then by taking the convex combination of those results (i.e. creating algorithms that randomly choose to run one of the algorithms generated in the previous step). However, the matrix representation of $\mathcal{A} \circ \mathfrak{M}_Q$ is equal to $AM_Q$ (where $A$ is the matrix representation of $\mathcal{A}$) and every row in $AM_Q$ is a nonnegative linear combination of rows in $M_Q$. Thus every row of the matrix representation of $\mathcal{A} \circ \mathfrak{M}_Q$ also satisfies the constraints defining $K_p$. Finally, creating an algorithm $\mathcal{A}^*$ that randomly chooses to run one algorithm in $\{\mathcal{A}_1 \circ \mathfrak{M}_{Q_1}, \ldots, \mathcal{A}_h \circ \mathfrak{M}_{Q_h}\}$ means that the rows in the matrix representation of $\mathcal{A}^*$ are convex combinations of the rows appearing in the matrix representations of the $\mathcal{A}_i \circ \mathfrak{M}_{Q_i}$ and so those rows also satisfy the constraints that define $K_p$. Therefore rowcone($\gamma$-FRAPP) $\subseteq K_p$. □

I Proof of Theorem 5.16

Theorem I.1. (Restatement and proof of Theorem 5.16.) Let the input domain $I = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$ be the set of integers. Let $\mathfrak{M}_{skell(\lambda_1,\lambda_2)}$ be the algorithm that adds to its input a random integer $k$ with the Skellam($\lambda_1, \lambda_2$) distribution and let $f_Z(\cdot;\lambda_1,\lambda_2)$ be the probability mass function of the Skellam($\lambda_1, \lambda_2$) distribution. A bounded row vector $x = (\ldots, x_{-2}, x_{-1}, x_0, x_1, x_2, \ldots)$ belongs to rowcone($\{\mathfrak{M}_{skell(\lambda_1,\lambda_2)}\}$) if for all integers $k$,

\[ \sum_{j=-\infty}^{\infty} (-1)^j f_Z(j;\lambda_1,\lambda_2)\, x_{k+j} \geq 0 \]

Proof. For integers $k$, define the functions

\[ f_X(k) = \begin{cases} e^{-\lambda_1} \lambda_1^{k} / k! & \text{if } k \geq 0 \\ 0 & \text{otherwise} \end{cases} \qquad f_Y(k) = \begin{cases} e^{-\lambda_2} \lambda_2^{-k} / (-k)! & \text{if } k \leq 0 \\ 0 & \text{otherwise} \end{cases} \]

Note that $f_X$ is the probability mass function for a Poisson($\lambda_1$) random variable $X$ while $f_Y$ is the probability mass function of the negative of a Poisson($\lambda_2$) random variable $Y$.

With this notation, the Skellam distribution is the distribution of the sum $X + (-Y)$. Therefore its probability mass function satisfies the following relation

\[ f_Z(k;\lambda_1,\lambda_2) = \sum_{j=-\infty}^{\infty} f_X(k - j) f_Y(j) = (f_X * f_Y)(k) \]

where $f_X * f_Y$ is the convolution operation.
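Now for each integer $k$ define

\[ g_X(k) = (-1)^k f_X(k) \]
\[ g_Y(k) = (-1)^k f_Y(k) \]
\[ g_Z(k) = (g_X * g_Y)(k) = \sum_{j=-\infty}^{\infty} g_X(k - j) g_Y(j) = \sum_{j=-\infty}^{\infty} (-1)^{k-j} f_X(k - j) (-1)^j f_Y(j) = (-1)^k \sum_{j=-\infty}^{\infty} f_X(k - j) f_Y(j) = (-1)^k f_Z(k;\lambda_1,\lambda_2) \]

We will need the following calculations:

\[ (g_X * f_X)(k) = \sum_{j=-\infty}^{\infty} g_X(k - j) f_X(j) = \sum_{j=-\infty}^{\infty} (-1)^{k-j} f_X(k - j) f_X(j) = \sum_{j=0}^{k} (-1)^{k-j} f_X(k - j) f_X(j) \]

(since $f_X$ is 0 for negative integers; also note the summand is 0 if $j > k$)

\[ = \mathbf{1}_{\{k \geq 0\}} \sum_{j=0}^{k} e^{-2\lambda_1} \frac{(-\lambda_1)^{k-j} \lambda_1^{j}}{(k - j)!\, j!} = \mathbf{1}_{\{k \geq 0\}} \frac{e^{-2\lambda_1}}{k!} (\lambda_1 - \lambda_1)^k = \begin{cases} e^{-2\lambda_1} & \text{if } k = 0 \\ 0 & \text{otherwise} \end{cases} \]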
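Similarly

\[ (g_Y * f_Y)(k) = \sum_{j=-\infty}^{\infty} g_Y(k - j) f_Y(j) = \sum_{j=-\infty}^{\infty} (-1)^{k-j} f_Y(k - j) f_Y(j) = \sum_{j=k}^{0} (-1)^{k-j} f_Y(k - j) f_Y(j) \]

(since $f_Y$ is 0 for positive integers; also note the summand is 0 if $k > j$)

\[ = \mathbf{1}_{\{k \leq 0\}} \sum_{j=k}^{0} (-1)^{k-j} e^{-2\lambda_2} \frac{\lambda_2^{-(k-j)} \lambda_2^{-j}}{(-(k - j))!\, (-j)!} = \mathbf{1}_{\{k \leq 0\}} \sum_{j=0}^{-k} e^{-2\lambda_2} \frac{(-\lambda_2)^{(-k)-j} \lambda_2^{j}}{((-k) - j)!\, j!} \]

(replacing the dummy index $j$ with $-j$)

\[ = \mathbf{1}_{\{k \leq 0\}} \frac{e^{-2\lambda_2}}{(-k)!} \sum_{j=0}^{-k} \binom{-k}{j} (-\lambda_2)^{(-k)-j} \lambda_2^{j} = \mathbf{1}_{\{k \leq 0\}} \frac{e^{-2\lambda_2}}{(-k)!} (\lambda_2 - \lambda_2)^{(-k)} = \begin{cases} e^{-2\lambda_2} & \text{if } k = 0 \\ 0 & \text{otherwise} \end{cases} \]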

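From these calculations we can conclude that

\[ (g_Z * f_Z(\cdot;\lambda_1,\lambda_2))(k) = ((g_X * g_Y) * (f_X * f_Y))(k) = ((g_X * f_X) * (g_Y * f_Y))(k) \]

(since convolutions are commutative and associative)

\[ = \begin{cases} e^{-2(\lambda_1 + \lambda_2)} & \text{if } k = 0 \\ 0 & \text{otherwise} \end{cases} \]

These convolution calculations show that the matrices $M^{(f)}$ and $M^{(g)}$, whose rows and columns are indexed by the integers and which are defined below, are inverses of each other.

\[ M^{(f)}_{(i,j)} = (i,j) \text{ entry of } M^{(f)} = f_Z(i - j;\lambda_1,\lambda_2) \]
\[ M^{(g)}_{(i,j)} = (i,j) \text{ entry of } M^{(g)} = e^{2(\lambda_1 + \lambda_2)} g_Z(i - j) \]

To see that they are inverses, note that the dot product between row $r$ of $M^{(f)}$ and column $c$ of $M^{(g)}$ is

\[ \sum_{j=-\infty}^{\infty} M^{(f)}_{(r,j)} M^{(g)}_{(j,c)} = \sum_{j=-\infty}^{\infty} f_Z(r - j;\lambda_1,\lambda_2)\, e^{2(\lambda_1 + \lambda_2)} g_Z(j - c) = \sum_{j=-\infty}^{\infty} f_Z(r - c - j;\lambda_1,\lambda_2)\, e^{2(\lambda_1 + \lambda_2)} g_Z(j) \]
\[ = e^{2(\lambda_1 + \lambda_2)} (g_Z * f_Z(\cdot;\lambda_1,\lambda_2))(r - c) = \begin{cases} 1 & \text{if } r = c \\ 0 & \text{otherwise} \end{cases} \]

Now, clearly $M^{(f)}$ is the matrix representation of $\mathfrak{M}_{skell(\lambda_1,\lambda_2)}$ so that we can again use Theorem 5.1 and the observation that $g_Z(k) = (-1)^k f_Z(k;\lambda_1,\lambda_2)$, so that column $c$ of $M^{(g)} = (M^{(f)})^{-1}$ is, up to the positive factor $e^{2(\lambda_1 + \lambda_2)}$, the column vector whose entry $j$ is $(-1)^{j-c} f_Z(j - c;\lambda_1,\lambda_2)$.

Note that the columns of $M^{(g)}$ have bounded $L_1$ norm since the absolute values of the entries in any column are proportional to the probabilities given by the Skellam distribution.

The proof is completed by the observation that for any $x = (\ldots, x_{-2}, x_{-1}, x_0, x_1, x_2, \ldots)$,

\[ \sum_{j=-\infty}^{\infty} (-1)^{j-c} f_Z(j - c;\lambda_1,\lambda_2)\, x_j = \sum_{j=-\infty}^{\infty} (-1)^{j} f_Z(j;\lambda_1,\lambda_2)\, x_{j+c} \]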

□ 
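
The key identity in this proof, $(g_Z * f_Z)(k) = e^{-2(\lambda_1+\lambda_2)}$ for $k = 0$ and $0$ otherwise, can be checked numerically. The sketch below is a rough illustration that truncates the infinite sums; the truncation limits and the parameter values $\lambda_1 = 1$, $\lambda_2 = 0.5$ are assumptions chosen so that the neglected tails are negligible.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability mass function, zero for negative k."""
    if k < 0:
        return 0.0
    return math.exp(-lam) * lam ** k / math.factorial(k)

def skellam_pmf(k, lam1, lam2, terms=60):
    """P(X - Y = k) for independent X ~ Poisson(lam1), Y ~ Poisson(lam2),
    computed from a truncated series over the value of Y."""
    return sum(poisson_pmf(k + y, lam1) * poisson_pmf(y, lam2) for y in range(terms))

def signed_convolution(k, lam1, lam2, window=40):
    """(g_Z * f_Z)(k) with g_Z(j) = (-1)^j f_Z(j), truncated to |j| <= window."""
    total = 0.0
    for j in range(-window, window + 1):
        sign = -1 if j % 2 else 1
        total += sign * skellam_pmf(j, lam1, lam2) * skellam_pmf(k - j, lam1, lam2)
    return total

lam1, lam2 = 1.0, 0.5
results = {k: signed_convolution(k, lam1, lam2) for k in range(-3, 4)}
print(results[0], math.exp(-2 * (lam1 + lam2)))
```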



J Proof of Lemma 5.14 

Lemma J.1. (Proof and restatement of Lemma 5.14). $\mathfrak{M}_{DNB(p,1)}$, the differenced negative binomial mechanism with $r = 1$, is the geometric mechanism.

Proof. We need to show that the difference between two independent Geometric($p$) distributions has the probability mass function $f(k) = \frac{1-p}{1+p}\, p^{|k|}$.

Let $X$ and $Y$ be independent Geometric($p$) random variables and let $Z = X - Y$. Then

\[ P(Z = k) = \begin{cases} \sum_{j=0}^{\infty} P(X = j + k) P(Y = j) & \text{if } k \geq 0 \\ \sum_{j=0}^{\infty} P(X = j) P(Y = j + |k|) & \text{if } k < 0 \end{cases} \]

Combining both cases, we get

\[ P(Z = k) = \sum_{j=0}^{\infty} (1 - p) p^{j + |k|} (1 - p) p^{j} = (1 - p)^2 p^{|k|} \sum_{j=0}^{\infty} (p^2)^j = \frac{(1 - p)^2 p^{|k|}}{(1 - p)(1 + p)} = \frac{1 - p}{1 + p}\, p^{|k|} \]

□ 
K Proof of Theorem 5.15

We first need an intermediate result.

Lemma K.1. Let $X$ and $Y$ be independent random variables with the Binomial($\frac{p}{1+p}, r$) distribution (where $p/(1 + p)$ is the success probability and $r$ is the number of trials). Let $Z = X - Y$ and let $f_B(k; \frac{p}{1+p}, r) = P(Z = k)$ for integers $k = -r, \ldots, 0, \ldots, r$. Define the function $h$ as $h(k) = (-1)^k f_B(k; \frac{p}{1+p}, r)$. The Fourier series transform $\hat{h}$ of $h$ (defined as $\hat{h}(t) = \sum_{\ell=-\infty}^{\infty} h(\ell) e^{i\ell t}$) is equal to

\[ \hat{h}(t) = \frac{1}{(1+p)^{2r}} (1 - p e^{it})^r (1 - p e^{-it})^r \]

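Proof. Define the random variable $Y' = -Y$. Then $X + Y' = Z$. Thus

\[ \hat{h}(t) = \sum_{\ell=-\infty}^{\infty} h(\ell) e^{i\ell t} = \sum_{\ell=-\infty}^{\infty} (-1)^{\ell} e^{i\ell t} P(Z = \ell) \]
\[ = \sum_{\ell=-\infty}^{\infty} e^{i\ell t} (-1)^{\ell} \sum_{j=-\infty}^{\infty} P(X = \ell - j) P(Y' = j) \]
\[ = \sum_{\ell=-\infty}^{\infty} e^{i\ell t} \sum_{j=-\infty}^{\infty} (-1)^{\ell - j} P(X = \ell - j) (-1)^{j} P(Y' = j) \]
\[ = \sum_{\ell=-\infty}^{\infty} \sum_{j=-\infty}^{\infty} (-1)^{\ell - j} e^{i(\ell - j)t} P(X = \ell - j) (-1)^{j} e^{ijt} P(Y' = j) \]
\[ = \sum_{j=-\infty}^{\infty} (-1)^{j} e^{ijt} P(Y' = j) \sum_{\ell=-\infty}^{\infty} (-1)^{\ell - j} e^{i(\ell - j)t} P(X = \ell - j) \]

Now,

\[ \sum_{\ell=-\infty}^{\infty} (-1)^{\ell - j} e^{i(\ell - j)t} P(X = \ell - j) = \sum_{\ell=-\infty}^{\infty} (-1)^{\ell} e^{i\ell t} P(X = \ell) = \sum_{\ell=0}^{r} (-1)^{\ell} e^{i\ell t} P(X = \ell) \]

(since $X$ can only be $0, \ldots, r$)

\[ = \sum_{\ell=0}^{r} \binom{r}{\ell} \left(\frac{-p e^{it}}{1+p}\right)^{\ell} \left(\frac{1}{1+p}\right)^{r-\ell} = \frac{1}{(1+p)^r} (1 - p e^{it})^r \quad \text{by the Binomial theorem} \]

Thus continuing our previous calculation,

\[ \hat{h}(t) = \sum_{j=-\infty}^{\infty} (-1)^{j} e^{ijt} P(Y' = j) \frac{1}{(1+p)^r} (1 - p e^{it})^r = \sum_{j=-r}^{0} (-1)^{j} e^{ijt} P(Y' = j) \frac{1}{(1+p)^r} (1 - p e^{it})^r \]

(since $Y'$ can only be $-r, \ldots, 0$)

\[ = \sum_{j=-r}^{0} (-1)^{j} e^{ijt} P(Y = -j) \frac{1}{(1+p)^r} (1 - p e^{it})^r \]

(since $Y' = -Y$)

\[ = \sum_{j=0}^{r} (-1)^{j} e^{-ijt} P(Y = j) \frac{1}{(1+p)^r} (1 - p e^{it})^r \]

Now, similar to what we did before, we can derive that $\sum_{j=0}^{r} (-1)^{j} e^{-ijt} P(Y = j) = \frac{1}{(1+p)^r} (1 - p e^{-it})^r$ and therefore

\[ \hat{h}(t) = \frac{1}{(1+p)^{2r}} (1 - p e^{it})^r (1 - p e^{-it})^r \]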

□ 
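The identity in Lemma K.1, $\hat{h}(t) = (1+p)^{-2r}(1-pe^{it})^{r}(1-pe^{-it})^{r}$, can be spot-checked numerically. The sketch below evaluates the finite sum $\sum_{k=-r}^{r} h(k)e^{ikt}$ directly and compares it with the closed form; the function names and the test values of $p$, $r$, and $t$ are illustrative assumptions rather than anything prescribed by the paper.

```python
import cmath
from math import comb


def binomial_pmf(j, q, r):
    """Binomial pmf with success probability q and r trials (zero off 0..r)."""
    return comb(r, j) * q**j * (1 - q) ** (r - j) if 0 <= j <= r else 0.0


def f_b(k, q, r):
    """pmf of X - Y for independent X, Y ~ Binomial(q, r)."""
    return sum(binomial_pmf(j + k, q, r) * binomial_pmf(j, q, r) for j in range(r + 1))


def h(k, p, r):
    """h(k) = (-1)**k * f_B(k; p / (1 + p), r)."""
    sign = -1 if k % 2 else 1
    return sign * f_b(k, p / (1 + p), r)


def h_hat_by_sum(t, p, r):
    """Fourier series transform of h; the sum is finite since h vanishes off -r..r."""
    return sum(h(k, p, r) * cmath.exp(1j * k * t) for k in range(-r, r + 1))


def h_hat_closed_form(t, p, r):
    """Closed form stated in Lemma K.1."""
    product = (1 - p * cmath.exp(1j * t)) * (1 - p * cmath.exp(-1j * t))
    return product**r / (1 + p) ** (2 * r)


if __name__ == "__main__":
    p, r, t = 0.5, 3, 0.7  # arbitrary illustrative values
    print(h_hat_by_sum(t, p, r), h_hat_closed_form(t, p, r))
```

Because $(1-pe^{it})(1-pe^{-it}) = 1 - 2p\cos t + p^2$ is real, both computed values should have a negligible imaginary part.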

Theorem K.2. (Restatement and proof of Theorem 5.15). A bounded row vector $x = (\dots, x_{-2}, x_{-1}, x_{0}, x_{1}, x_{2}, \dots)$ belongs to rowcone$(\{\mathfrak{M}_{DNB(p,r)}\})$ if for all integers $k$,

$$\sum_{j=-r}^{r} (-1)^{j} f_B\left(j; \tfrac{p}{1+p}, r\right) x_{k+j} \geq 0$$

where $p$ and $r$ are the parameters of the differenced negative binomial distribution and $f_B(\cdot\,; p/(1+p), r)$ is the probability mass function of the difference of two independent binomial (not negative binomial) distributions whose parameters are $p/(1+p)$ (success probability) and $r$ (number of trials).

Proof. For convenience, define the function h as follows: 

$$h(j) = (-1)^{j} f_B\left(j; \tfrac{p}{1+p}, r\right)$$

Let $g_{NB}(\cdot\,; p, r)$ be the probability distribution function for the difference of two independent NB($p$, $r$) random variables. Then the matrix representation $M_{DNB(p,r)}$ of the differenced negative binomial mechanism $\mathfrak{M}_{DNB(p,r)}$ is the matrix whose rows and columns are indexed by the integers and whose entries are defined as:

$$(i,j) \text{ entry of } M_{DNB(p,r)} = g_{NB}(i - j; p, r)$$

By Theorem 5.1 we need to show that $M_{DNB(p,r)}$ is the inverse of $\frac{(1+p)^{2r}}{(1-p)^{2r}} H$, where $H$ is the matrix whose rows and columns are indexed by the integers and whose entries are defined as:

$$(i,j) \text{ entry of } H = h(i - j) = (-1)^{i-j} f_B\left(i - j; \tfrac{p}{1+p}, r\right)$$

(to see how Theorem 5.1 is applied, note that each entry of the product $xH$ has the form $\sum_{j=-r}^{r} (-1)^{j} f_B\left(j; \tfrac{p}{1+p}, r\right) x_{k+j}$).



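Now, to show that $M_{DNB(p,r)}$ and $\frac{(1+p)^{2r}}{(1-p)^{2r}} H$ are inverses of each other, we note that

$$\begin{aligned}
(i,j) \text{ entry of } \left(M_{DNB(p,r)} H\right)
&= \sum_{\ell=-\infty}^{\infty} g_{NB}(i - \ell; p, r)\, h(\ell - j) \\
&= \sum_{\ell'=-\infty}^{\infty} g_{NB}(i - j - \ell'; p, r)\, h(\ell') \\
&= \sum_{\ell'=-r}^{r} g_{NB}(i - j - \ell'; p, r)\, h(\ell')
\end{aligned} \tag{20}$$

The second step substitutes $\ell' = \ell - j$. The last step follows from the fact that $f_B\left(\ell'; \tfrac{p}{1+p}, r\right)$ and $h(\ell')$ are nonzero only when $\ell'$ is between $-r$ and $r$, since $f_B\left(\cdot\,; \tfrac{p}{1+p}, r\right)$ is the probability mass function of the difference of two binomial random variables (each of which is bounded between $0$ and $r$).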

Now, Equation 20 is the definition of the convolution [35] of $g_{NB}(\cdot\,; p, r)$ and $h$ at the point $i - j$. That is,

$$\left(g_{NB}(\cdot\,; p, r) * h\right)(k) = \sum_{\ell'=-r}^{r} g_{NB}(k - \ell'; p, r)\, h(\ell')$$

and thus to show that $M_{DNB(p,r)}$ and $\frac{(1+p)^{2r}}{(1-p)^{2r}} H$ are inverses of each other, we just need to show that the convolution of $g_{NB}(\cdot\,; p, r)$ and $h$ at the point $0$ is equal to $\frac{(1-p)^{2r}}{(1+p)^{2r}}$ and that the convolution at all other integers is $0$. In other words, we want to show that for all integers $k$,

$$\left(g_{NB}(\cdot\,; p, r) * h\right)(k) = \frac{(1-p)^{2r}}{(1+p)^{2r}}\, \delta(k) \tag{21}$$
where $\delta$ is the function with $\delta(0) = 1$ and $\delta(k) = 0$ for all other integers. Take the Fourier series transform of both sides while noting two facts: (1) the Fourier series transform of $\delta$ is $\hat{\delta}(t) = \sum_{\ell=-\infty}^{\infty} \delta(\ell)\, e^{i\ell t} = 1$, and (2) the Fourier transform of a convolution is the product of the Fourier transforms [35]. Then the transformed version of Equation 21 becomes

$$\hat{g}_{NB}(t)\, \hat{h}(t) = \frac{(1-p)^{2r}}{(1+p)^{2r}}\, \hat{\delta}(t) = \frac{(1-p)^{2r}}{(1+p)^{2r}} \tag{22}$$

for all real $t$, where $\hat{g}_{NB}$, $\hat{h}$, $\hat{\delta}$ are the Fourier series transforms of $g_{NB}(\cdot\,; p, r)$, $h$, and $\delta$, respectively. Once we prove that Equation 22 is true, this implies Equation 21 is true (by the inverse Fourier transform), which then implies that $M_{DNB(p,r)}$ and $\frac{(1+p)^{2r}}{(1-p)^{2r}} H$ are inverses of each other, and this would finish the proof (by Theorem 5.1).

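Thus our goal is to prove Equation 22. The Fourier series transform (i.e. characteristic function), as a function of $t$, of the NB($p$, $r$) distribution is known to be:

$$\left(\frac{1-p}{1 - p e^{it}}\right)^{r}$$

so $g_{NB}(\cdot\,; p, r)$, being the difference of two independent negative binomial random variables, has the Fourier series transform (as a function of $t$)

$$\hat{g}_{NB}(t) = \left(\frac{1-p}{1 - p e^{it}}\right)^{r} \left(\frac{1-p}{1 - p e^{-it}}\right)^{r}$$

By Lemma K.1,

$$\hat{h}(t) = \frac{1}{(1+p)^{2r}} \left(1 - p e^{it}\right)^{r} \left(1 - p e^{-it}\right)^{r}$$

Multiplying the two transforms, the factors $(1 - p e^{it})^{r}$ and $(1 - p e^{-it})^{r}$ cancel, leaving $\frac{(1-p)^{2r}}{(1+p)^{2r}}$. Thus Equation 22 is true and we are done. □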
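The conclusion of the proof, that the convolution of $g_{NB}(\cdot\,; p, r)$ with $h(j) = (-1)^{j} f_B(j; p/(1+p), r)$ equals $\frac{(1-p)^{2r}}{(1+p)^{2r}}$ at $0$ and vanishes elsewhere, can also be checked numerically. The sketch below assumes the NB($p$, $r$) probability mass function $\binom{j+r-1}{j}(1-p)^{r}p^{j}$ on $j \geq 0$, which is the parameterization consistent with the characteristic function $\left(\frac{1-p}{1-pe^{it}}\right)^{r}$ used above; the function names and the truncation length `terms` are illustrative choices.

```python
from math import comb


def nb_pmf(j, p, r):
    """Assumed NB(p, r) pmf: C(j + r - 1, j) * (1 - p)**r * p**j on j >= 0."""
    return comb(j + r - 1, j) * (1 - p) ** r * p**j if j >= 0 else 0.0


def g_nb(k, p, r, terms=400):
    """Truncated pmf of the difference of two independent NB(p, r) variables."""
    return sum(nb_pmf(j + k, p, r) * nb_pmf(j, p, r) for j in range(terms))


def binomial_pmf(j, q, r):
    """Binomial pmf with success probability q and r trials (zero off 0..r)."""
    return comb(r, j) * q**j * (1 - q) ** (r - j) if 0 <= j <= r else 0.0


def h(k, p, r):
    """h(k) = (-1)**k * f_B(k; p / (1 + p), r), with f_B a binomial difference pmf."""
    q = p / (1 + p)
    f_b = sum(binomial_pmf(j + k, q, r) * binomial_pmf(j, q, r) for j in range(r + 1))
    return (-1 if k % 2 else 1) * f_b


def convolution(k, p, r):
    """(g_NB * h)(k), a finite sum because h vanishes outside -r..r."""
    return sum(g_nb(k - l, p, r) * h(l, p, r) for l in range(-r, r + 1))


if __name__ == "__main__":
    p, r = 0.5, 2  # arbitrary illustrative parameters
    print("expected value at 0:", (1 - p) ** (2 * r) / (1 + p) ** (2 * r))
    for k in range(-3, 4):
        print(k, convolution(k, p, r))
```

Up to truncation and floating-point error, only the $k = 0$ entry should be nonzero, which is the statement of Equation 21.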